(57) Abstract



Isolated nucleic acid molecules encoding novel members of the p62 family of polypeptides, which include, in preferred embodiments,
an SH2 binding domain and a ubiquitin binding domain, are described. Also disclosed are novel members of the p160 family of polypeptides.
The p62 polypeptides and the p160 polypeptides of the invention are capable of modulating leukocyte activity, e.g., by stimulating a B
cell response, including B cell proliferation, B cell aggregation, B cell differentiation, and B cell survival, and/or stimulating a T cell response,
e.g., T cell proliferation, T cell aggregation, T cell differentiation, and T cell survival. The p62 polypeptides and the
p160 polypeptides of the invention are also capable of modulating ubiquitin-mediated degradation of cellular proteins. In addition to
isolated nucleic acid molecules, antisense nucleic acid molecules, recombinant expression vectors containing a nucleic acid molecule of
the invention, and host cells into which the expression vectors have been introduced are also described. The invention further provides isolated
p62 polypeptides and isolated p160 polypeptides, fusion polypeptides and active fragments thereof. Diagnostic and therapeutic methods
utilizing compositions of the invention are also provided.

p62 POLYPEPTIDES, RELATED POLYPEPTIDES, AND USES THEREFOR

Background of the Invention 

Engagement of the T cell antigen receptor (TCR) by peptide antigen bound to the
major histocompatibility complex (MHC) molecules initiates a biochemical cascade
involving protein tyrosine kinases (PTKs) and protein tyrosine phosphatases (PTPases).
Recent biochemical and genetic evidence has implicated at least three cytoplasmic
PTKs, Lck, Fyn, and ZAP-70, in the initiation of TCR signal
transduction. Chan, A.C. et al. (1994) Annu. Rev. Immunol. 12:555-592. Lck and Fyn
are members of the Src-family (Cooper, J.A. (1989) "The Src Family of Protein
Tyrosine Kinases" In Peptides and Protein Phosphorylation, ed. Kemp, B. and Alewood,
P.F. (CRC Press, Boca Raton) pp. 85-113) and ZAP-70 is a member of the Syk-family.
The Src-family PTKs share a number of common structural features including: (1) an
N-terminal myristylated glycine at residue 2 that permits membrane localization; (2) a
unique approximately 80 amino acid N-terminal region that may dictate specific
associations of the kinase; (3) an approximately 60 amino acid Src-homology 3 (SH3)
domain involved in interacting with signaling molecules with proline-rich regions
(reviewed in Pawson, T. et al. (1992) Cell 71:359-362); (4) an approximately 100 amino
acid Src-homology 2 (SH2) domain that can specifically mediate the recruitment of
tyrosine phosphoproteins (reviewed in Pawson, T. et al. (1992) Cell 71:359-362); (5) a
C-terminal catalytic domain; and (6) a negative regulatory tyrosine residue C-terminal to
the kinase domain. Chan, A.C. et al. (1994) Annu. Rev. Immunol. 12:555-592.

Lck is a 56 kDa lymphoid-specific PTK that noncovalently associates with the
cytoplasmic domains of CD4 and CD8 through cysteine-dependent interactions. Rudd,
C.E. et al. (1988) Proc. Natl. Acad. Sci. USA 85:5190-5194; Veillette, A. et al. (1988)
Cell 55:301-308; Turner, J.M. et al. (1990) Cell 60:755-765; Shaw, A.S. et al. (1989)
Cell 59:627-636; Shaw, A.S. et al. (1990) Mol. Cell. Biol. 10:1853-1862. The
extracellular domains of CD4 and CD8 serve as TCR co-receptors by binding the
monomorphic regions of MHC class II or I molecules, respectively, to stabilize the
interaction between T cells and antigen presenting cells. Doyle, C. et al. (1988) Nature
330:256-258; Norment, A.M. et al. (1988) Nature 336:79-81. In addition to this
stabilizing function, the association of CD4 and CD8 with Lck has also suggested a
potential role in signal transduction for these TCR co-receptors. Veillette, A. et al.
(1989) Nature 338:257-259. Specifically, the association of Lck and CD4 has been
shown to be an essential, but not the only, requirement for co-receptor function in TCR
signaling. Chan, A.C. et al. (1994) Annu. Rev. Immunol. 12:555-592.

Further evidence, in the form of genetic studies, demonstrates
the importance of Lck in both thymocyte development and TCR-mediated cell signaling.
Chan, A.C. et al. (1994) Annu. Rev. Immunol. 12:555-592. For example, mice deficient
in Lck, as a result of homologous recombination, have a pronounced arrest in thymocyte
development with a 10-30 fold decrease in total thymocyte number. Molina, T.J. et al.
(1992) Nature 357:161-164. Whereas the double-negative (CD4-CD8-) thymocyte
population was similar to that of normal littermates, there was a dramatic reduction in the
double-positive (CD4+CD8+) thymocyte population (10-60 fold) and no detectable
single-positive (CD4+CD8- and CD4-CD8+) thymocytes. A marked reduction also
occurred in the number of peripheral T cells, though the few peripheral T cells were
capable of mounting a diminished proliferative response to antibody-mediated
cross-linking of the TCR. Thus, Lck appears to be critical for normal thymocyte development.
Chan, A.C. et al. (1994) Annu. Rev. Immunol. 12:555-592.

The role of Lck in TCR-mediated signaling is further supported by results from
two studies in which loss of a functional Lck protein abrogated TCR-mediated signaling.
In the first study, a mutant of the Jurkat leukemic T cell line, J.CaM1.6, lacking a
functional Lck PTK failed to mobilize calcium, to induce tyrosine phosphoproteins, or to
express activation antigens following TCR stimulation. Straus, D. and Weiss, A. (1992)
Cell 70:585-593. Reconstitution with wild-type murine Lck in this mutant restored all
TCR-mediated functions. In the second study, a spontaneous variant of an IL-2-dependent
cytotoxic T cell line lacking Lck also manifested a profound reduction in
TCR-mediated cytolysis that was restored following Lck expression. Karnitz, L. et al.
(1992) Mol. Cell. Biol. 12:4521-4530. Both mutants demonstrated comparable levels of
Fyn kinase activity relative to their parental counterparts. The fact that normal levels of
other Src-family PTKs in these cells are unable to compensate for the Lck deficit
demonstrates that Lck plays a critical role in TCR-mediated signal transduction. Chan,
A.C. et al. (1994) Annu. Rev. Immunol. 12:555-592.

Further studies have yielded results which are consistent with the requirement for
Lck in TCR-mediated signaling. Specifically, overexpression of an "activated" form of
Lck(F505) in a CD4-negative murine T cell hybridoma resulted in enhanced
antigen-induced IL-2 secretion and TCR-induced cellular tyrosine phosphoproteins. Abraham,
N. et al. (1991) Nature 350:62-66. In addition, further analysis of the domains within
Lck that participate in TCR function has shown that membrane
localization and the SH2 domain of Lck are both required. Caron, L. et al. (1992) Mol.
Cell. Biol. 12:2720-2729. Mutation of the N-terminal site of myristylation (thereby
preventing membrane localization of Lck(F505)) or deletion of the SH2 domain of
Lck(F505) abolished the TCR-induced hyperresponsiveness as indicated by cellular
tyrosine phosphorylation and antigen-induced IL-2 production. In contrast, retroviral
infection of T helper hybridoma cell lines with a temperature-sensitive Lck(F505)
resulted in antigen-independent IL-2 production at the permissive temperature. Luo, K.
and Sefton, B.M. (1992) Mol. Cell. Biol. 12:4724-4732. In this system, while deletion of
the SH2 domain abrogated antigen-independent IL-2 production, deletion of the SH3
domain did not significantly alter IL-2 production. Thus, the SH2 domain may be
required to interact with downstream effector molecules in propagating TCR function.

Given the above-described studies, further information about the mechanisms and
cellular components which regulate Lck function would offer potential new routes for
modulating Lck/TCR-mediated cell signaling and lymphoid cell development and/or
function.

Summary of the Invention 

This invention is based, at least in part, on the discovery of a family of
polypeptides, designated herein as p62 polypeptides, which share at least two
structural/functional properties, at least one of which is relevant to Lck function. The
p62 polypeptides include, for example, an SH2 binding domain, e.g., an SH2 binding
domain which binds an SH2 domain of Lck independent of phosphotyrosine, and a
ubiquitin binding domain.

Preferred p62 polypeptides of the present invention include several additional
structural/functional domains such as a zinc finger domain, a GTPase binding domain,
domains containing phosphorylation sites, a PEST domain, and an SH3 binding domain.
p62 polypeptides within the scope of the invention are also characterized functionally
by, for example, the ability to modulate T cell activity, e.g., T cell
development/differentiation, T cell activation, lymphokine secretion; the ability to
modulate B cell activity, e.g., B cell development/differentiation, B cell activation,
antibody secretion; the ability to modulate ubiquitin-mediated degradation of cellular
proteins; the ability to modulate expression of cell cycle dependent kinase
inhibitors, e.g., p21CIP; the ability to bind to at least one polypeptide involved in the ras
cell signaling cascade, e.g., p120-GAP; the ability to bind to GTPase; the ability to
modulate cell cycle progression; and the ability to modulate cell proliferation.

The present invention also relates to a second family of polypeptides, designated
herein as p160 polypeptides. The p160 polypeptides are related functionally to the p62
polypeptides in that the p160 polypeptides bind to the p62/p56lck complex to thereby
modulate Lck function in a manner similar to that described herein for the p62 polypeptides.
The p160 polypeptides activate transcription of a variety of genes upon, for example,
activation of p62. The genes which are transcribed in response to p160 activation
include those which are involved in T or B cell development/differentiation, T or B cell
activation, and production of T or B cell-specific factors, e.g., lymphokines and
antibodies, respectively. The p160 polypeptides of the present invention have also been
found to be substrates for serine/threonine kinase activity.

Accordingly, this invention pertains to isolated nucleic acid molecules encoding
p62 polypeptides. Such nucleic acid molecules (e.g., cDNAs) have a nucleotide
sequence encoding a p62 polypeptide (e.g., a human polypeptide) or biologically active
portions or fragments thereof, such as a peptide having a p62 activity. In a preferred
embodiment, the isolated nucleic acid molecule has a nucleotide sequence shown in
Figure 1, SEQ ID NO:1, or a portion or fragment thereof, or a nucleotide sequence
shown in Figure 3, SEQ ID NO:3, or a portion or fragment thereof. Preferred regions of
these nucleotide sequences are the coding regions. Other preferred nucleic acid
molecules are those which have at least about 60%, preferably at least about 70%, more
preferably at least about 80%, and most preferably at least about 90%, 95%, 97% or
98% or more overall nucleotide sequence identity with a nucleotide sequence shown in
Figure 1, SEQ ID NO:1, or a portion or fragment thereof, or a nucleotide sequence
shown in Figure 3, SEQ ID NO:3, or a portion or fragment thereof. Nucleic acid
molecules which hybridize under stringent conditions to the nucleotide sequence shown
in Figure 1, SEQ ID NO:1 or the nucleotide sequence shown in Figure 3, SEQ ID NO:3
are also within the scope of the invention. Portions or fragments of the nucleic acid
molecules of the present invention are also specifically contemplated. Such portions or
fragments include nucleotide sequences which encode, for example, polypeptide
domains having a p62 activity. Examples of portions or fragments of nucleic acid
molecules which encode such domains include portions or fragments of nucleotide
sequences of Figure 1, SEQ ID NO:1 and of Figure 3, SEQ ID NO:3 which encode one
or more of the following: a ubiquitin binding domain; an SH2 binding domain; a zinc
finger domain; at least one phosphorylation site; a GTPase binding domain; a PEST
domain; and an SH3 domain. Particularly preferred nucleotide sequences encoding each
of these domains are described herein.

In another embodiment, the nucleic acid molecules of the invention encode a
polypeptide having an amino acid sequence shown in Figure 2, SEQ ID NO:2, or a
portion or fragment thereof having a biological activity, e.g., a p62 activity, or an amino
acid sequence shown in Figure 4, SEQ ID NO:4, or a portion or fragment thereof having
a p62 activity. Nucleic acid molecules encoding a polypeptide having at least about
60%, preferably at least about 70%, more preferably at least about 80%, and most
preferably at least about 90%, 95%, 97% or 98% overall sequence identity with an
amino acid sequence shown in Figure 2, SEQ ID NO:2, or a portion or fragment thereof
having a biological activity, e.g., a p62 activity, or an amino acid sequence shown in
Figure 4, SEQ ID NO:4, or a portion or fragment thereof having a biological activity,
e.g., a p62 activity, are also within the scope of the invention.
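
By way of illustration only, the identity thresholds recited above (e.g., at least about 60%, 70%, 80% or 90%) can be evaluated with a calculation along the lines of the following Python sketch. The function names, the gap character, the choice of denominator (every aligned column in which at least one sequence has a residue) and the toy sequences are assumptions made for this example; no particular alignment algorithm or identity formula is prescribed in this section.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: columns in which both sequences carry a gap are skipped;
    every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # gap-only column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from any SEQ ID NO.
    query = "MKTAYIAKQR"
    subject = "MKTAYLAKQ-"
    print(percent_identity(query, subject))
    print(meets_threshold(query, subject, 70.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values, so the convention used should be stated alongside any reported percentage.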

This invention further pertains to nucleic acid molecules which encode p62
polypeptides which bind to ubiquitin, a ubiquitin analog, derivative or active fragment,
and an SH2 domain. In a preferred embodiment, the p62 polypeptides bind an SH2
domain having an amino acid sequence which has at least about 70%, more preferably at
least about 80%, and most preferably at least about 90% or more (e.g., 95%, 97% or
98%) sequence identity with an amino acid sequence of the SH2 domain of p56lck. In
one embodiment, the polypeptide binds to the SH2 domain of p56lck as shown in Figure
5, SEQ ID NO:5. The p62 polypeptides encoded by the nucleic acids of the present
invention can also have one or more, in any combination, of various p62 activities.
These activities include (1) the ability to bind to a Lck SH2 domain or Lck related SH2
domain (i.e., an SH2 domain which comprises an amino acid sequence having at least
about 70% sequence identity with the amino acid sequence of the SH2 domain of
p56lck), preferably in a phosphotyrosine (pY)-independent manner; (2) the ability to
bind to ubiquitin or a ubiquitin analog, derivative or active fragment thereof; (3) the
ability to modulate (e.g., inhibit or stimulate) T cell development (e.g., differentiation)
or T cell activation (e.g., lymphokine secretion); (4) the ability to modulate B cell
development (e.g., differentiation) or B cell activation (e.g., antibody secretion); (5) the
ability to inhibit ubiquitin-mediated degradation of cellular proteins such as cell cycle
regulatory proteins (e.g., p53); (6) the ability to modulate expression of cell cycle
dependent kinase inhibitors, e.g., p21CIP; (7) the ability to bind to proteins involved in
the ras cell signaling cascade, e.g., p120-GAP; (8) the ability to bind to GTPase; (9) the
ability to modulate cell cycle progression, e.g., inhibit or arrest cell cycle progression at,
for example, the G1/S boundary; and (10) the ability to modulate (e.g., inhibit or
stimulate) cell proliferation.

Another aspect of the invention pertains to nucleic acid molecules which encode
polypeptides which are fragments of at least about 20 amino acid residues in length,
more preferably at least about 30 amino acid residues in length or more, of an amino
acid sequence shown in Figure 2, SEQ ID NO:2 or an amino acid sequence shown in
Figure 4, SEQ ID NO:4. Other aspects of the invention pertain to nucleic acid
molecules which encode polypeptides which are fragments of at least about 20 amino
acid residues in length, more preferably at least about 30 amino acid residues in length,
which have at least about 70%, more preferably at least about 80%, and most preferably
at least about 90% or more (e.g., 95%, 97-98%) overall sequence identity with an amino
acid sequence shown in Figure 2, SEQ ID NO:2, or a portion or fragment thereof having
a biological activity, e.g., a p62 activity, or an amino acid sequence shown in Figure 4,
SEQ ID NO:4, or a portion or fragment thereof having a biological activity, e.g., a p62
activity. Portions or fragments of the polypeptides encoded by the nucleic acids of the
invention include polypeptide regions which comprise, for example, various structural
and/or functional domains of p62. Such domains include portions or fragments of
nucleotide sequences of Figure 1, SEQ ID NO:1 and of Figure 3, SEQ ID NO:3 which
encode one or more of the following: a ubiquitin binding domain; an SH2 binding
domain; at least one phosphorylation site; a GTPase binding domain; a PEST domain;
and an SH3 binding domain. The specific amino acid sequences of each of these domains
are described herein. Nucleic acid molecules which are antisense to the nucleic acid
molecules described herein are also within the scope of the invention.

Another aspect of the invention pertains to recombinant expression vectors
containing the nucleic acid molecules of the invention and host cells into which such
recombinant expression vectors have been introduced. In one embodiment, such a host
cell is used to produce a p62 polypeptide by culturing the host cell in a suitable medium.
If desired, a p62 polypeptide can then be isolated from the medium or the host
cell.

Still another aspect of the invention pertains to isolated p62 polypeptides (e.g.,
isolated human p62 polypeptides) and active fragments thereof, such as peptides having
an activity of a p62 polypeptide (e.g., at least one biological activity of a p62
polypeptide as described herein). The invention also provides an isolated or purified
preparation of a p62 polypeptide. In preferred embodiments, a p62 polypeptide
comprises an amino acid sequence of Figure 2, SEQ ID NO:2 or an amino acid sequence
of Figure 4, SEQ ID NO:4. In other embodiments, the isolated p62 polypeptide
comprises an amino acid sequence having at least 70%, more preferably 80%, and most
preferably 90% (e.g., 95%, 97%-98%) or more overall sequence identity with an amino
acid sequence of Figure 2, SEQ ID NO:2 or an amino acid sequence of Figure 4, SEQ
ID NO:4 and preferably has an activity of a p62 polypeptide (e.g., at least one biological
activity of p62).

This invention also pertains to isolated p62 polypeptides which bind to ubiquitin,
a ubiquitin analog, derivative or active fragment, and an SH2 domain. In a preferred
embodiment, the p62 polypeptides bind an SH2 domain having an amino acid sequence
which is at least about 70%, more preferably at least about 80%, and most preferably at
least about 90% or more identical to an amino acid sequence of the SH2 domain of
p56lck. The binding of the SH2 binding domain to the SH2 domain can be
phosphotyrosine independent. In one embodiment, the p62 polypeptides bind to the
SH2 domain of p56lck as shown in Figure 5, SEQ ID NO:5. In other preferred
embodiments, the p62 polypeptide domain which binds ubiquitin, a ubiquitin analog,
derivative or active fragment has at least about 50% or more overall sequence
identity with an amino acid sequence which includes amino acid residues 323 to 440 of
Figure 2, SEQ ID NO:2 or amino acid residues 303 to 419 of Figure 4, SEQ ID NO:4.
These peptides can optionally include a zinc finger domain, e.g., a zinc finger domain
having an amino acid sequence which has at least about 50% or more overall sequence
identity with an amino acid sequence which includes amino acid residues 128 to 163 of
Figure 2, SEQ ID NO:2 or an amino acid sequence which includes amino acid residues
108 to 143 of Figure 4, SEQ ID NO:4 and/or a GTPase binding domain, e.g., a GTPase
binding domain having an amino acid sequence which has at least about 50% or more
overall sequence identity with an amino acid sequence which includes amino acid
residues 66 to 82 of Figure 2, SEQ ID NO:2 or an amino acid sequence which includes
amino acid residues 46 to 62 of Figure 4, SEQ ID NO:4.

Other optional domains which can be included in the peptides of the present
invention include a PEST domain, e.g., a PEST domain having an amino acid sequence
which has at least about 50% or more overall sequence identity with an amino acid
sequence which includes amino acid residues 266 to 296 of Figure 2, SEQ ID NO:2 or
an amino acid sequence which includes amino acid residues 246 to 276 of Figure 4, SEQ
ID NO:4 and/or an SH3 binding domain, e.g., an SH3 binding domain having an amino
acid sequence which has at least about 50% or more overall sequence identity with an
amino acid sequence which includes amino acid residues 202 to 211 of Figure 2, SEQ
ID NO:2 or an amino acid sequence which includes amino acid residues 183 to 191 of
Figure 4, SEQ ID NO:4 and an SH3 domain. These isolated p62 polypeptides can have
one or more, in any combination, of the p62 biological activities described herein.
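
For convenience, the residue ranges recited above for the optional domains can be collected in a lookup table, as in the following Python sketch. The dictionary layout and helper names are hypothetical; the coordinates are copied from the text and are treated as 1-based and inclusive, which is an assumption, and no sequence data from SEQ ID NO:2 or SEQ ID NO:4 is reproduced.

```python
# Domain name -> ((start, end) in SEQ ID NO:2, (start, end) in SEQ ID NO:4).
# Coordinates as recited in the text; assumed 1-based and inclusive.
P62_DOMAINS = {
    "ubiquitin binding": ((323, 440), (303, 419)),
    "zinc finger": ((128, 163), (108, 143)),
    "GTPase binding": ((66, 82), (46, 62)),
    "PEST": ((266, 296), (246, 276)),
    "SH3 binding": ((202, 211), (183, 191)),
}


def domain_length(span):
    """Number of residues in a 1-based inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def extract_domain(sequence, span):
    """Slice a 1-based inclusive span out of a (0-based) Python string."""
    start, end = span
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError(f"span {span} does not fit a sequence of length {len(sequence)}")
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, (span_seq2, span_seq4) in P62_DOMAINS.items():
        # Print both lengths and the shift between the two start positions.
        print(name, domain_length(span_seq2), domain_length(span_seq4),
              span_seq2[0] - span_seq4[0])
```

With a full-length sequence held in a string, a call such as `extract_domain(sequence, P62_DOMAINS["PEST"][0])` would return the corresponding region for comparison against a candidate polypeptide.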

Fragments of the p62 polypeptides of the invention can include portions or
fragments of the amino acid sequences shown in Figure 2, SEQ ID NO:2 or Figure 4,
SEQ ID NO:4 which are at least about 20 amino acid residues, at least about 30, or at
least about 40 or more amino acid residues in length. The peptide fragments preferably
have a p62 activity and can be modified to impart desired characteristics thereon. For
example, peptide fragments having a p62 activity can be modified for such purposes as
increasing solubility, enhancing therapeutic or prophylactic efficacy, or improving stability (e.g.,
shelf life ex vivo and resistance to proteolytic degradation in vivo). Such modified
peptides are considered functional equivalents of peptides having an activity of p62 as
defined herein. A modified peptide can be produced in which the amino acid sequence
has been altered, such as by amino acid substitution, deletion, or addition, to modify a
p62 activity, or to which a component has been added for the same purpose. The p62
polypeptide portions or fragments described herein can have a p62 activity, e.g., one or
more, in any combination, of the p62 biological activities described herein. Portions or
fragments of the polypeptides of the invention can include polypeptide regions which
comprise, for example, various structural and/or functional domains. Such domains
include portions or fragments of amino acid sequences of Figure 2, SEQ ID NO:2 and of
Figure 4, SEQ ID NO:4 which comprise at least one of the following: a ubiquitin binding
domain; an SH2 binding domain; a zinc finger domain; at least one phosphorylation site;
a GTPase binding domain; a PEST domain; and an SH3 binding domain. Preferred
amino acid sequences of each of these domains are described herein.

The invention also provides for a p62 fusion polypeptide comprising a p62
polypeptide and a second polypeptide portion having an amino acid sequence from a
protein unrelated to an amino acid sequence selected from the group consisting of an
amino acid sequence shown in Figure 2, SEQ ID NO:2 and an amino acid sequence
shown in Figure 4, SEQ ID NO:4. In addition, a p62 polypeptide of the invention can be
incorporated into a pharmaceutical composition which includes the polypeptide (or
active portion thereof) and a pharmaceutically acceptable carrier. In addition, vaccine
compositions which include a p62 polypeptide or a vector containing a nucleic acid
molecule which encodes a p62 polypeptide are also within the scope of the invention.
Antibodies, e.g., monoclonal or polyclonal antibodies, which bind to a p62 polypeptide
or fragment thereof are also specifically contemplated in the present invention.

The p62 polypeptides of the invention can be used to modulate, for example,
leukocyte proliferation and/or activity in vitro or in vivo. In one embodiment, the
invention provides a method for inhibiting cell proliferation in a subject, e.g., a mammal,
e.g., a human. This method includes administering to the subject a therapeutically
effective amount of an agent which modulates p62 expression such that p62 expression
is stimulated. Agents which modulate p62 expression can be used to inhibit cell
proliferation which is, for example, associated with tumor formation and growth (i.e.,
neoplasia), e.g., cervical cancer, e.g., cervical cancer induced by human papilloma virus
(HPV), e.g., HPV-1, HPV-2, HPV-3, HPV-4, HPV-5, HPV-6, HPV-7, HPV-8, HPV-9,
HPV-10, HPV-11, HPV-12, HPV-14, HPV-13, HPV-15, HPV-16, HPV-17 or HPV-18,
and particularly high-risk HPVs, such as HPV-16, HPV-18, HPV-31 and HPV-33.
Additional methods for inhibiting cell proliferation in a subject which are within the
scope of the invention include administration to the subject of a therapeutically effective amount
of a p62 polypeptide or fragment thereof or a vector comprising a nucleic acid molecule
encoding a p62 polypeptide or fragment thereof. In another embodiment, the invention
provides a method for promoting cell proliferation in a subject, e.g., a mammal, e.g., a
human. This method can include administering to the subject a therapeutically effective
amount of an agent which modulates p62 expression such that p62 expression is
inhibited. Agents which modulate p62 expression can be used to promote cell
proliferation in desired locations and in desired circumstances, e.g., to promote wound
healing (e.g., skin cell growth) or hair growth. Other methods for promoting cell
proliferation in a subject which are within the scope of the invention include
administration to the subject of a therapeutically effective amount of an inhibitor of a
p62 polypeptide such as a nucleic acid molecule which is antisense to a nucleic acid
molecule encoding a p62 polypeptide or an antibody which binds a p62 polypeptide.

The invention further provides methods for modulating T cell activity, e.g., T
cell proliferation, differentiation, cytokine secretion, or B cell activity, e.g., B cell
proliferation, differentiation, antibody secretion, in a subject comprising administering
to the subject a therapeutically effective amount of an agent which modulates p62
expression, or a therapeutically effective amount of an agent which activates or inhibits
a p62 polypeptide.

Additional methods of the invention include assays for identifying agents which
inhibit or activate/stimulate a p62 polypeptide. Inhibitory or stimulatory agents
identified according to these methods are within the scope of the invention. In one
embodiment, for example, an agent which inhibits a p62 polypeptide can be identified
by contacting a first polypeptide comprising an SH2 domain of p56lck with a second
polypeptide comprising a p62 polypeptide and an agent to be tested and then
determining binding of the second polypeptide to the first polypeptide. Inhibition of
binding of the first polypeptide to the second polypeptide indicates that the agent is an
inhibitor of a p62 polypeptide while activation of binding of the first polypeptide to the
second polypeptide indicates that the agent is an activator of a p62 polypeptide.

Alternative methods for identifying an agent which inhibits or activates/stimulates a
p62 polypeptide are also within the scope of the invention. For example, an alternative
method for identifying an agent which inhibits or activates a p62 polypeptide includes
contacting a p53 protein, p53 analog, derivative or active fragment, under conditions
which promote ubiquitination of the p53 protein, p53 analog, derivative or active
fragment, with an agent to be tested and then determining p53 ubiquitination level in the
presence of the agent. Activation of p53 ubiquitination indicates that the agent is an
inhibitor of a p62 polypeptide while inhibition of p53 ubiquitination indicates that the
agent is an activator/stimulator of a p62 polypeptide.

Other alternative methods for identifying an agent which inhibits or
activates/stimulates a p62 polypeptide are contemplated by the present invention. These
methods include contacting a first polypeptide comprising ubiquitin, a ubiquitin analog,
derivative or active fragment, with a second polypeptide comprising a p62 polypeptide
and an agent to be tested and then determining binding of the second polypeptide to the
first polypeptide. Inhibition of binding of the first polypeptide to the second polypeptide
indicates that the agent is an inhibitor of a p62 polypeptide while activation/stimulation
of binding of the first polypeptide to the second polypeptide indicates that the agent is an
activator/stimulator of a p62 polypeptide.

Still other alternative methods for identifying an agent which inhibits or
activates/stimulates a p62 polypeptide are provided by the present invention. For
example, another method for identifying an agent which inhibits a p62 polypeptide
includes contacting a first polypeptide comprising p53 protein, p53 analog, derivative or
active fragment, with a second polypeptide comprising a p62 polypeptide and an agent
to be tested and then measuring the level of p53 degradation in the presence of the agent.
If a comparison of the level of p53 degradation in the presence of the agent to the level
of p53 degradation in the absence of the agent shows an increase in the level of p53
degradation in the presence of the agent, the agent is an inhibitor of a p62 polypeptide.
If a comparison of the level of p53 degradation in the presence of the agent to the level
of p53 degradation in the absence of the agent shows a decrease in the level of p53
degradation in the presence of the agent, the agent is an activator/stimulator of a p62
polypeptide.
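
The screening assays summarized above share a common readout logic: a measurement made in the presence of the test agent is compared with a measurement made in its absence, and the direction of the change classifies the agent. The following Python sketch encodes that logic as stated in the text. It is illustrative only; the assay keys, the function name, the 10% relative tolerance and the numeric values are hypothetical and are not taken from the specification.

```python
from enum import Enum


class Call(Enum):
    INHIBITOR = "inhibitor of a p62 polypeptide"
    ACTIVATOR = "activator/stimulator of a p62 polypeptide"
    NO_EFFECT = "no detectable effect"


# Direction of change that indicates an inhibitor, per the text:
#   binding assays (Lck SH2 domain or ubiquitin): decreased binding -> inhibitor
#   p53 ubiquitination or p53 degradation: increased level -> inhibitor
INHIBITOR_WHEN_DECREASED = {
    "sh2_binding": True,
    "ubiquitin_binding": True,
    "p53_ubiquitination": False,
    "p53_degradation": False,
}


def classify_agent(assay, with_agent, without_agent, rel_tol=0.10):
    """Classify a test agent from one assay readout and its no-agent control.

    rel_tol is an assumed noise band: relative changes within it are
    reported as no effect.
    """
    if assay not in INHIBITOR_WHEN_DECREASED:
        raise ValueError(f"unknown assay: {assay}")
    if without_agent <= 0:
        raise ValueError("control measurement must be positive")
    change = (with_agent - without_agent) / without_agent
    if abs(change) <= rel_tol:
        return Call.NO_EFFECT
    decreased = change < 0
    if decreased == INHIBITOR_WHEN_DECREASED[assay]:
        return Call.INHIBITOR
    return Call.ACTIVATOR


if __name__ == "__main__":
    # Hypothetical readouts in arbitrary units.
    print(classify_agent("sh2_binding", 40.0, 100.0))
    print(classify_agent("p53_degradation", 150.0, 100.0))
```

In practice the tolerance would be replaced by whatever statistical criterion the assay supports, such as replicate measurements with a significance test.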

Another aspect of the invention includes an isolated nucleic acid molecule
comprising a nucleotide sequence encoding a p160 polypeptide. In a preferred
embodiment, the nucleic acid sequence encoding a p160 polypeptide comprises a
nucleotide sequence shown in Figure 8, SEQ ID NO:6 or in Figure 10, SEQ ID NO:7 or
a nucleotide sequence encoding an amino acid sequence shown in Figure 9, SEQ ID
NO:8 or Figure 11, SEQ ID NO:9.

Other aspects of the invention include isolated polypeptides having a p160
activity. Examples of such polypeptides include polypeptides having an amino acid
sequence shown in Figure 9, SEQ ID NO:8 or Figure 11, SEQ ID NO:9 or a fragment
thereof.

Still further aspects of the invention pertain to methods for modulating T cell
activity, e.g., T cell proliferation, differentiation, cytokine secretion, or B cell activity,
e.g., B cell proliferation, differentiation, antibody secretion, in a subject. These methods
include administering to the subject a therapeutically effective amount of an agent which
modulates p160 expression, or a therapeutically effective amount of an agent which
activates or inhibits a p160 polypeptide. Also specifically contemplated by the present
invention are methods for identifying agents which inhibit or activate/stimulate p160
polypeptides. These methods include steps which are parallel to those described herein
for methods of identifying agents which inhibit or activate/stimulate p62 polypeptides.
Moreover, as the p160 polypeptides of the present invention are involved in the p62
cellular regulatory activities described herein, the p160 polypeptides have similar
applications and uses as the p62 polypeptides.

Brief Description of the Drawings 

Figure 1 is the nucleotide sequence of an approximately 2.1kb (2083bp) cDNA
encoding a first full length human p62 polypeptide (SEQ ID NO:1).

Figure 2 is the predicted full length amino acid sequence (440 amino acid
residues) of the human p62 polypeptide (SEQ ID NO:2) encoded by the nucleotide
sequence shown in Figure 1.

Figure 3 is the nucleotide sequence of an approximately 2.0kb (1977bp) cDNA
encoding a second human p62 polypeptide (SEQ ID NO:3).

Figure 4 is the predicted amino acid sequence (419 amino acid residues) of the
human p62 polypeptide (SEQ ID NO:4) encoded by the nucleotide sequence shown in
Figure 3.

Figure 5 is the amino acid sequence of the SH2 domain of p56lck (SEQ ID
NO:5).

Figure 6 is the nucleotide sequence (beginning at nucleotide 101 of SEQ ID
NO:1) encoding the first full length human p62 (top) aligned for comparison to the
nucleotide sequence (SEQ ID NO:3) encoding the second human p62 polypeptide
(bottom). The regions of identity are marked by lines connecting the identical
nucleotides.

Figure 7 is the amino acid sequence (SEQ ID NO:2) of the first full
length human p62 (top) aligned for comparison to the amino acid sequence (SEQ ID
NO:4) of the second human p62 polypeptide (bottom). The regions of identity are
marked by lines connecting the identical amino acid residues.



WO 97/22255 



PCT/US96/19944 




Figure 8 is the nucleotide sequence of an approximately 3.9kb (3901bp) cDNA
encoding a first full length human p160 polypeptide (p160.1) (SEQ ID NO:6).

Figure 9 is the predicted full length amino acid sequence (1135 amino acid
residues) of the first human p160 polypeptide (p160.1) (SEQ ID NO:7) encoded by the
nucleotide sequence shown in Figure 8.

Figure 10 is the nucleotide sequence of an approximately 3.2kb (3211bp) cDNA
encoding a second full length human p160 polypeptide (p160.2) (SEQ ID NO:8).

Figure 11 is the predicted full length amino acid sequence (905 amino acid
residues) of the second human p160 polypeptide (p160.2) (SEQ ID NO:9) encoded by
the nucleotide sequence shown in Figure 10.

Figures 12A-12C depict the results of experiments demonstrating that p62 binds
to the Lck SH2 domain in a phosphotyrosine independent manner. Figure 12A is a
schematic representation of the construction of glutathione S-transferase (GST)-fusion
proteins containing regions of p56lck. Figure 12B is an autoradiograph of a 9% SDS-PAGE
on which lysates from 35S-methionine labelled HeLa cells incubated with GST
and GST fusion proteins containing unique N-terminal region (1-77), unique N-terminal
region and SH3 domain (1-123), and SH2 domain (119-224) were separated. A 62 kD
protein (p62) that bound specifically to the SH2 domain is marked with an arrow.
Figure 12C is a photograph of an SDS-PAGE on which lysates from 35S-methionine
labelled HeLa cells (which were lysed in the presence or absence of phosphatase
inhibitors (NaVO4 and NaF), protease inhibitors (PMSF and Leupeptin), or reducing
reagent (DTT)) incubated with GST.119-224 were analyzed.

Figure 13 depicts the results of experiments demonstrating that the
phosphotyrosine independent binding of p62 to the p56lck SH2 domain is competed by
specific phosphotyrosyl peptides. Figure 13 is an autoradiograph of a 9% SDS-PAGE
on which lysates from 35S-methionine labelled HeLa cells (which were lysed in the
presence of phosphatase inhibitors (NaVO4 and NaF)) incubated with increasing
concentrations of phosphotyrosyl peptides (pY324, pY505, pY771, and pY536) were
separated.

Figures 14A-14B depict the results of experiments demonstrating distinct
mechanisms for phosphotyrosine-dependent and -independent protein binding to the
SH2 domain. Figure 14A is a photograph of an immunoblot on which GST alone,
GST.119-224, and GST.119-224.R154K incubated with v-src transfected HeLa cell
lysate in the presence of phosphatase inhibitor were analyzed using an
anti-phosphotyrosine antibody. Figure 14B is a photograph of an SDS-PAGE on which GST
alone, GST.119-224, and GST.119-224.R154K incubated with 35S-methionine labeled
HeLa cell lysate in the presence of phosphatase inhibitors were analyzed. Competition
of p62 binding to the SH2 domain by phosphotyrosyl peptide was measured by adding
10 mM pY324 peptide in the incubation mixture.

Figures 15A-15C depict the results of experiments demonstrating regulation of
p62 binding to the p56lck SH2 domain by Ser59 phosphorylation of p56lck. Figure 15A
is an autoradiograph of an SDS-PAGE on which HeLa cell lysates (from HeLa cells
transfected with v-src or vector alone, labelled with 35S-methionine, and lysed in the
presence or absence of phosphatase inhibitors) incubated with GST alone, GST.119-224,
and GST.53-224 were analyzed. Samples that were lysed in the absence of phosphatase
inhibitors were treated with exogenous recombinant phosphatase mixture (recombinant
catalytic fragments of the tyrosine phosphatases LAR, CD45, and SHPTP-1). Figure
15B shows the same membrane as in Figure 15A but which was immunoblotted with
anti-phosphotyrosine antibody. p62 and two phosphotyrosyl proteins (pp70 and pp80)
are marked. Figure 15C is an autoradiograph on which HeLa cell lysates (from HeLa
cells labelled with 35S-methionine and lysed in the absence of phosphatase inhibitors)
incubated with GST alone, GST.119-224, GST.65-224, and GST.53-224.S59E were
analyzed. This autoradiograph shows that truncation of the Ser59 region or mutation of
Ser59 to Glu59 restores p62 binding to the SH2 domain.

Figures 16A-16E depict the results of experiments demonstrating that p62 is a
novel polypeptide which binds to p120 ras-GAP. Figure 16A is an autoradiograph of an
SDS-PAGE on which HeLa cell lysates (from HeLa cells labelled with 35S-methionine
and lysed in the presence or absence of phosphatase inhibitors) incubated with GST
alone or with GST.119-224 and immunoprecipitated by ras-GAP were analyzed. A
protein that comigrates with p62 is coimmunoprecipitated by ras-GAP. Figure 16B is
an autoradiograph of an SDS-PAGE and Figure 16C is a photograph of an SDS-PAGE
stained with Coomassie blue on which the HeLa cell lysates described above were
immunoprecipitated with anti-GAP antibody or with a preimmune serum. Recombinant
p62 GAP binding protein (rp62GAPbp) was run on SDS-PAGE along with GST.119-224
and ras-GAP binding proteins of Figure 15A. The prominent bands in Figure 16C are
rp62GAPbp (lane 1), antibody (lane 2), and fusion protein (lane 3). Figure 16D is an
autoradiograph of an SDS-PAGE on which V8 partial digestions of p62 bound to
GST.119-224 and ras-GAP were analyzed. Figure 16E depicts the amino acid sequence
of a Lys-C digested peptide of purified p62.

Figures 17A-17E depict the results of experiments demonstrating that one of the
phosphotyrosine-independent proteins binding to the Lck SH2 domain is a ser/thr
kinase. Figure 17A is an autoradiograph of an SDS-PAGE on which HeLa cell lysates
(from HeLa cells labelled with 35S-methionine and lysed in the presence or absence of
phosphatase inhibitors and competing peptide pY324) incubated with GST alone or with
GST.119-224 were analyzed (lanes 2, 4, 6, and 8). Kinase activity was also measured
by incubating the bound proteins with kinase buffer and 32P-γ-ATP (lanes 1, 3, 5, and
7). Figure 17B is an autoradiograph of an SDS-PAGE on which phosphorylation of
myelin basic protein (MBP), incubated with sample aliquots from Figure 17A, lanes 2,
4, 6, and 8, kinase buffer, and 32P-γ-ATP, was visualized. Figure 17C is an
autoradiograph of an SDS-PAGE on which MBP kinase activity (lane 1) was
sequentially eluted with competing pY324 peptide (lane 2) and then with glutathione
(lane 3) from glutathione-agarose bound to GST.119-224 and its associated proteins
(part of the sample shown in Figure 17A, lane 6, was used). Figure 17D is a
phosphoamino acid analysis of phosphorylated MBP of Figure 17B. Figure 17E is an
autoradiograph of an MBP-containing gel on which GST and GST.119-224 bound
proteins in HeLa cell lysates, prepared in the absence of NaVO4 as described (lanes 1
and 2 respectively) eluted either with NaVO4 (lane 3) or with pY324 peptide (lane 4)
were separated and subjected to kinase assay (Tobe, K. et al. (1992) J. Biol. Chem.
267:21089-21097). For a positive control, 0.5 mg of purified p44.erk1 (UBI) was used
(lane 5). A sample of an in vitro kinase assay as described in Figure 17A, lane 5, was
separately run on a SDS-PAGE (lane 6) and compared with in-gel kinase assay.

Figure 18 is the nucleotide sequence (SEQ ID NO:6) encoding the first full
length human p160 (p160.1) (top) aligned for comparison to the nucleotide sequence
(SEQ ID NO:8) encoding the second full length human p160 polypeptide (p160.2)
(bottom). The regions of identity are marked by lines connecting the identical
nucleotides.

Figure 19 is the amino acid sequence (SEQ ID NO:7) of the first full
length human p160 (p160.1) (top) aligned for comparison to the amino acid sequence
(SEQ ID NO:9) of the second human p160 polypeptide (p160.2) (bottom). The
regions of identity are marked by lines connecting the identical amino acid residues.

Detailed Description of the Invention

The present invention pertains to the family of novel p62 polypeptides, or active
portions thereof, which are capable of, for example, modulating T or B cell development
(e.g., T or B cell differentiation) and/or T or B cell activation by, for example,
modulation of Lck activity. The p62 polypeptides of the invention are also capable of
modulating degradation of cellular proteins, e.g., cell cycle regulatory proteins,
stimulating expression of cell cycle dependent kinase inhibitors, and arresting cell cycle
progression at specific boundaries, to thereby modulate cell proliferation, e.g., cell
proliferation associated with tumor formation and growth. Other activities of the p62
polypeptides of the invention are described herein.

Particularly preferred p62 polypeptides are human polypeptides. The complete
nucleotide (2083 nucleotides shown in Figure 1, SEQ ID NO:1) and amino acid
sequence (440 amino acids shown in Figure 2, SEQ ID NO:2) of a first member of the
p62 polypeptide family are disclosed herein. A plasmid containing the full length
nucleotide sequence encoding this first p62 polypeptide was deposited with the
American Type Culture Collection (ATCC) on December 19, 1995 and was assigned
ATCC Accession Number 97387. This first p62 polypeptide family member is a human
cytoplasmic polypeptide with a molecular weight of about 62kD and is expressed in a
variety of tissues including heart, brain, placenta, lung, liver, skeletal muscle, kidney,
and pancreas. The mRNA which encodes this polypeptide is about 2kb in length. This p62
polypeptide includes several defined domains. The N-terminal 50 amino acids (amino
acid residues 1-50 of the amino acid sequence of Figure 2, SEQ ID NO:2, which are
encoded by nucleotides 67-216 of the nucleotide sequence of Figure 1, SEQ ID NO:1) of
the p62 polypeptide comprise an SH2 binding domain, e.g., an SH2 binding domain
which does not include phosphotyrosine. A rac GTPase binding motif appears at amino
acid residues 66-82 of Figure 2, SEQ ID NO:2 (which are encoded by nucleotides
262-312 as shown in Figure 1, SEQ ID NO:1) of the first p62 polypeptide. The rac GTPase
binding motif can be compared as follows to the proposed consensus sequence for rac
GTPase set forth in Zhou et al. ((1995) J. Biol. Chem. 270:12665-12669) which also
appears in human MEK5, scd1 (see also Chang et al. (1994) Cell 79:131-141), and
cdc24 (see also Miyamoto et al. (1991) Biochem. Biophys. Res. Commun.
181:604-610):


PROTEIN     RAC GTPase CONSENSUS SEQUENCE

p62          66  HYRDEDGDLVAFSSDEE   82
MEK5         61  EYEDEDGDRITVRSDEE   77
scd1        786  KYVDEDGDFITITSDED  802
cdc24       696  KYQDEDGDFVVLGSDED  715
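
The four motifs listed above can be compared position by position. The sketch below is an illustration only: it reports the residues that are identical in all four listed motifs, which is a stricter summary than the consensus proposed by Zhou et al. and is not a reproduction of it. The names `MOTIFS` and `strict_consensus` are hypothetical.

```python
# The four 17-residue motifs as listed in the comparison above.
MOTIFS = {
    "p62": "HYRDEDGDLVAFSSDEE",
    "MEK5": "EYEDEDGDRITVRSDEE",
    "scd1": "KYVDEDGDFITITSDED",
    "cdc24": "KYQDEDGDFVVLGSDED",
}


def strict_consensus(sequences, placeholder="."):
    """Keep a residue only where every sequence has the same residue."""
    sequences = list(sequences)
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all motifs must have the same length")
    return "".join(
        column[0] if len(set(column)) == 1 else placeholder
        for column in zip(*sequences)
    )


consensus = strict_consensus(MOTIFS.values())
conserved_positions = sum(ch != "." for ch in consensus)
print(consensus, conserved_positions)
```

Positions shown as `.` differ in at least one of the four motifs; conservative substitutions at those positions are not scored by this simple rule.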



The first p62 polypeptide also includes a zinc finger domain which comprises
amino acid residues 128-163 of Figure 2, SEQ ID NO:2, which are encoded by
nucleotides 448-555 of Figure 1, SEQ ID NO:1. In addition, an SH3 binding domain
appears at amino acid residues 202-211 (encoded by nucleotides 670-699 of Figure 1,
SEQ ID NO:1) and a proline-glutamic acid-serine-threonine (PEST) rich motif appears
at amino acid residues 266-294 (encoded by nucleotides 862-954 of Figure 1, SEQ ID
NO:1). The presence of a PEST motif is typically associated with rapid degradation of
the polypeptide which contains the motif. The first p62 polypeptide family member also
includes at least two phosphorylation sites: at threonine 269 of the amino acid sequence
of Figure 2, SEQ ID NO:2 (encoded by nucleotides 871-873 of the nucleotide sequence
shown in Figure 1, SEQ ID NO:1) and at serine 272 of the amino acid sequence shown
in Figure 2, SEQ ID NO:2 (encoded by nucleotides 880-882 of the nucleotide sequence
shown in Figure 1, SEQ ID NO:1). The C-terminus of the first p62 polypeptide includes
an amino acid sequence comprising amino acid residues 323 to 440 of the amino acid
sequence shown in Figure 2, SEQ ID NO:2 (encoded by nucleotides 1033 to 1386 of the
nucleotide sequence shown in Figure 1, SEQ ID NO:1), which comprise a ubiquitin
binding domain.
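
The nucleotide ranges quoted for each domain follow from the residue numbers once the position of the first codon is known. The following is a minimal sketch of that arithmetic, assuming the coding region of SEQ ID NO:1 begins at nucleotide 67 (an inference drawn here from residues 1-50 being encoded by nucleotides 67-216). The function name `residues_to_nucleotides` and the constant `P62_ORF_START` are hypothetical.

```python
def residues_to_nucleotides(first_residue: int, last_residue: int,
                            orf_start: int) -> tuple:
    """Map an inclusive residue range to its inclusive coding nucleotide range.

    `orf_start` is the 1-based position of the first nucleotide of codon 1.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("invalid residue range")
    start = orf_start + 3 * (first_residue - 1)
    end = orf_start + 3 * last_residue - 1
    return start, end


# Assumption: codon 1 of the first p62 cDNA starts at nucleotide 67.
P62_ORF_START = 67

print(residues_to_nucleotides(1, 50, P62_ORF_START))     # SH2 binding domain
print(residues_to_nucleotides(66, 82, P62_ORF_START))    # rac GTPase binding motif
print(residues_to_nucleotides(128, 163, P62_ORF_START))  # zinc finger domain
print(residues_to_nucleotides(323, 440, P62_ORF_START))  # ubiquitin binding domain
```

The same function applies to any cDNA once its own first-codon position is supplied.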

A nucleotide (1977 nucleotides shown in Figure 3, SEQ ID NO:3) and amino
acid sequence (419 amino acids shown in Figure 4, SEQ ID NO:4) of a second member
of the p62 polypeptide family are also disclosed herein. A plasmid containing the
nucleotide sequence encoding this second p62 polypeptide was deposited with the
American Type Culture Collection (ATCC) on December 19, 1995 and was assigned
ATCC Accession Number 97386. This second p62 polypeptide family member is also a
human cytoplasmic polypeptide with a molecular weight of about 62kD and is expressed
in a variety of tissues including B cells and other cells of hematopoietic origin, e.g., T
cells. The mRNA which encodes this polypeptide is about 2kb in length. This second p62
polypeptide is encoded by a nucleic acid sequence which has a 77.5% overall sequence
identity with the nucleotide sequence shown in Figure 1, SEQ ID NO:1. The amino
acid sequence of the second p62 polypeptide has an 88.5% overall sequence identity
with the amino acid sequence shown in Figure 2, SEQ ID NO:2. A comparison of the
nucleotide sequences of the first p62 polypeptide and the second p62 polypeptide is
shown in Figure 6. A comparison of the amino acid sequences of the first p62
polypeptide and the second p62 polypeptide is shown in Figure 7. Like the first p62
polypeptide, the second p62 polypeptide family member includes several defined
domains. The SH2 binding domain of the second p62 polypeptide comprises at least
amino acid residues 1-20 of the amino acid sequence of Figure 4, SEQ ID NO:4. A rac
GTPase binding motif appears at amino acid residues 46-62 as shown in Figure 4, SEQ
ID NO:4 (which are encoded by nucleotides 136-186 as shown in Figure 3, SEQ ID
NO:3) of the second p62 polypeptide. The second p62 polypeptide also includes a zinc
finger domain which comprises amino acid residues 108-143 of Figure 4, SEQ ID NO:4,
which are encoded by nucleotides 322-429 of Figure 3, SEQ ID NO:3. In addition, an
SH3 binding domain appears at amino acid residues 183-191 (encoded by nucleotides
548-573 of Figure 3, SEQ ID NO:3) and a PEST motif appears at amino acid residues
246-276 of Figure 4, SEQ ID NO:4 (encoded by nucleotides 736-828 of Figure 3, SEQ
ID NO:3). The second p62 polypeptide family member also includes at least one
phosphorylation site at threonine 249 of the amino acid sequence of Figure 4, SEQ ID
NO:4 (encoded by nucleotides 745-747 of the nucleotide sequence shown in Figure 3,
SEQ ID NO:3). The C-terminus of the second p62 polypeptide includes an amino acid
sequence comprising amino acid residues 303-419 of the amino acid sequence shown in
Figure 4, SEQ ID NO:4 (encoded by nucleotides 907-1257 of the nucleotide sequence
shown in Figure 3, SEQ ID NO:3), which comprise a ubiquitin binding domain.
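
Overall sequence identity figures such as those given above are computed from a pairwise alignment. The text does not state which alignment method or denominator was used, so the sketch below is only one common convention, stated as an assumption: identity is the number of identical aligned positions divided by the number of alignment columns, skipping columns that are gaps in both sequences. The function name `percent_identity` and the toy sequences are made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Assumed convention: columns with a gap in only one sequence count as
    mismatches; columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Toy aligned sequences (not taken from the figures).
print(percent_identity("ACGT-A", "ACCTGA"))
```

Other conventions, for example dividing by the length of the shorter sequence, give different values for the same alignment, so a reported percentage is only comparable when the convention is known.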

Members of the human p62 polypeptide family are the first polypeptides shown
to have both an SH2 binding domain and a ubiquitin binding domain. Furthermore, the
p62 polypeptides bind to SH2 domains in a phosphotyrosine-independent manner.
Although other proteins have been demonstrated as having this characteristic (see e.g.,
Malek, S.N. et al. (1994) J. Biol. Chem. 269(52):33009-33020 (p130 PITSLRE protein);
Cleghon, V. et al. (1994) J. Biol. Chem. 269(26):17749-17755 (raf-1 protein); Muller,
A.J. et al. (1992) Mol. Cell. Biol. 12(11):5087-5093 (BCR protein)), these proteins
require phosphorylation of one or more of their serine residues. Binding of the p62
polypeptides to an SH2 domain, e.g., the SH2 domain of Lck, however, does not require
phosphorylation of a p62 serine residue. Moreover, neither the p130 PITSLRE protein,
the raf-1 protein, nor the BCR protein has been shown to include a ubiquitin binding
domain.

Accordingly, this invention pertains to p62 polypeptides and to active portions or
fragments thereof, such as peptides having an activity of p62. The phrases "an activity
of p62" or "having a p62 activity" are used interchangeably herein to refer to molecules
such as proteins, polypeptides, and peptides which have one or more of the following
functional characteristics:

(1) the p62 polypeptide binds to an SH2 domain, e.g., an SH2 domain which
comprises an amino acid sequence having at least about 70% or more (e.g., 80%, 90%,
95%, 97%, 98%) sequence identity with the amino acid sequence of the SH2 domain of
p56lck. In a preferred embodiment, the p62 polypeptide binds to the SH2 domain of
p56lck. The binding of the p62 polypeptide to an SH2 domain is preferably
phosphotyrosine independent;

(2) the p62 polypeptide binds, e.g., binds noncovalently, to ubiquitin, a ubiquitin
analog, derivative or active fragment;

(3) the p62 polypeptide modulates T cell development (e.g., T cell
differentiation) and/or T cell activation (e.g., lymphokine secretion);

(4) the p62 polypeptide modulates B cell development (e.g., B cell
differentiation) and/or B cell activation (e.g., antibody secretion);

(5) the p62 polypeptide modulates (e.g., inhibits) ubiquitin-mediated degradation
of cellular proteins such as cell cycle regulatory proteins (e.g., p53);

(6) the p62 polypeptide modulates (e.g., stimulates) expression of cell cycle
dependent kinase inhibitors (e.g., p21Cip);

(7) the p62 polypeptide binds to or interacts with proteins involved in the ras cell
signaling cascade, e.g., p120-GAP;

(8) the p62 polypeptide binds to or interacts with GTPase;

(9) the p62 polypeptide modulates cell cycle progression, e.g., arrests cell cycle
progression at, for example, the G1/S boundary;

(10) the p62 polypeptide modulates, e.g., inhibits, cell proliferation (e.g., cell
proliferation associated with neoplasia); and

(11) the p62 polypeptide associates with a Ser/Thr protein kinase activity.

The p62 polypeptides can have different activities in different tissues. For
example, in T and B cells, the p62 polypeptides can activate T or B cell development as
described herein. In other cells, e.g., epithelial cells such as HeLa cells, however, the p62
polypeptides can inhibit cell cycle progression.

The phrase "SH2 domain", as used herein, refers to a conserved sequence of
approximately 100 amino acids found in many signal transduction proteins including
Fps, Src, Abl, GAP, PLCγ, v-Crk, Nck, Lck, Fyn, p85, and Vav. See, e.g., Koch et al.
(1991) Science 252:668, incorporated herein by reference (provides the amino acid
sequences of the SH2 domain of 27 proteins). The SH2 domain mediates protein-protein
interactions between the SH2 containing protein and other proteins by
recognition of a specific site on a second protein. The SH2/second protein site
interaction usually results in an association of the SH2 containing protein and the second
protein. As used herein, SH2 domain refers to any sequence with at least about 70%,
preferably at least about 80%, and more preferably at least about 90% or more (95%,
97%-98%) sequence identity with a naturally occurring SH2 domain, e.g., the SH2
domain of Lck (also referred to herein as "p56lck") as shown in Figure 5, SEQ ID NO:5.

As used herein, the term "ubiquitin" is art recognized and refers to a polypeptide,
e.g., a polypeptide of about 76 amino acids, which mediates degradation of intracellular
proteins in eukaryotic cells. Ubiquitin modification of a variety of protein targets within
the cell is important in a number of basic cellular functions such as regulation of gene
expression, regulation of the cell-cycle, modification of cell surface receptors,
biogenesis of ribosomes, and DNA repair. Several key regulatory proteins are known to
be degraded through the ubiquitin-mediated pathway, including certain transcriptional
regulators, key enzymes of metabolic pathways, cyclins, and the tumor suppressor p53.
Targeted proteins which undergo selective ubiquitin-mediated degradation are
covalently tagged with ubiquitin through the formation of an isopeptide bond between
the C-terminal glycyl residue of ubiquitin and a specific lysyl residue in the substrate
protein. This process is catalyzed by a ubiquitin-activating enzyme (E1) and a
ubiquitin-conjugating enzyme (E2), and in some instances may also require auxiliary substrate
recognition proteins (E3s). Following the linkage of the first ubiquitin chain, additional
molecules of ubiquitin may be attached to lysine side chains of the previously
conjugated moiety to form branched multi-ubiquitin chains. Once ubiquitin is
conjugated to the target protein, a variety of evidence suggests that ubiquitin protein
conjugates are degraded by a proteasome, a multisubunit protein complex. The term
"ubiquitin" encompasses ubiquitin analogs, derivatives or active fragments thereof
which are capable of mediating degradation of intracellular proteins as described herein.

Ubiquitin binds to proteins via three known mechanisms. In the first
mechanism, ubiquitin is conjugated to a target protein through an isopeptide bond
between the C-terminal glycyl residue of ubiquitin and the ε-amino group of a specific
lysyl residue in the substrate protein. The second mechanism of ubiquitin binding to a
target protein is a covalent binding of monoubiquitin to a protein such as that observed
when ubiquitin binds to ubiquitin activating enzyme (E1), ubiquitin conjugating enzyme
(E2), or ubiquitin ligase (E3). This mechanism of binding uses an ATP-dependent
thioester formation between a cysteine residue in the active site of these enzymes and
ubiquitin. Dissociation of these enzyme-ubiquitin complexes requires dithiothreitol (DTT).
In the third mechanism, ubiquitin binds noncovalently to certain proteins such as ubiquitin
hydrolase and deubiquitinase. This mode of interaction is a simple noncovalent
protein-protein interaction.

Association and dissociation of p62 with ubiquitin does not require ATP or DTT.
This mode of binding indicates that the p62-ubiquitin interaction involves noncovalent
binding. p62, however, does not share conserved regions with ubiquitin hydrolase and
deubiquitinase. Furthermore, p62 cannot cleave covalently attached ubiquitin from a target
protein. Thus, although p62-ubiquitin binding is noncovalent binding, the specific mode
of binding is unlike that previously demonstrated for ubiquitin hydrolase and
deubiquitinase.

As used herein, the phrase "cell cycle dependent kinase inhibitor" refers to
molecules, e.g., proteins or peptides, which inhibit at least one cyclin dependent kinase
(cdk). In the eukaryotic cell cycle, a key role is played by the cdks. Cdk complexes are
formed via the association of a regulatory cyclin subunit and a catalytic kinase subunit.
In mammalian cells, the combination of the kinase subunits (cdc2, cdk2, cdk4, cdk5,
cdk6) with a variety of cyclin subunits (cyclin A, B1, B2, D1, D2, D3 and E) results in
the assembly of functionally distinct kinase complexes. The coordinated activation of
these complexes drives the cells through the cell cycle and ensures the fidelity of the
process (Draetta (1990) Trends Biochem. Sci. 15:378-382; Sherr (1993) Cell
73:1059-1065). Recently, a link has been established between the regulation of the activity of
cyclin-dependent kinases and cancer by the discovery of a group of cdk inhibitors
including p27Kip1, p21Waf1/Cip1 and p16Ink4/MTS1. p21Waf1/Cip1 is positively regulated
by the tumor suppressor p53 which is mutated in approximately 50% of all human
cancers. Harper et al. (1993) Cell 75:805-816. p21Waf1/Cip1 may mediate the tumor
suppressor activity of p53 at the level of cyclin-dependent kinase activity. The
inhibitory activity of p27Kip1 is induced by the negative growth factor TGF-β and by
contact inhibition (Polyak et al. (1994) Cell 78:66-69). These proteins, when bound to
cdk complexes, inhibit their kinase activity, thereby inhibiting progression through the
cell cycle. Although their precise mechanism of action is unknown, it is thought that
binding of these inhibitors to the cdk/cyclin complex prevents its activation.
Alternatively, these inhibitors may interfere with the interaction of the enzyme with its
substrates or its cofactors. In addition to modulating the expression of cdk inhibitors, the p62
polypeptides can be targets of the cdks, e.g., the p62 polypeptides can be
phosphorylated, e.g., at one or more of the phosphorylation sites described herein, by a
cdk.

Proteins involved in the ras cell signaling pathway or cascade are art recognized.
See, e.g., Murray, A. and Hunt, T. eds. The Cell Cycle: An Introduction (W.H. Freeman
and Company, New York) pp. 109-110. Briefly, the ras cell signaling cascade begins
with cell activation, e.g., cell activation by a growth factor, and activation of the growth
factor receptor. Receptor binding leads to the binding of adaptor proteins, such as
GRB2 and SEM5, which contain SH2 and SH3 domains. The adaptor proteins activate
guanine nucleotide-exchange proteins and GTPase activating proteins, e.g., p120-GAP,
which, in turn, activate small G proteins such as ras. Ras, which is a GTPase, in turn,
induces activation and phosphorylation of raf, a protein kinase. Raf is the first member
of the protein kinase cascade which ultimately leads to the phosphorylation and
activation of MAP kinase. Activation of MAP kinase leads to its translocation into the
nucleus where it induces transcription. The p62 polypeptides of the present invention
can bind to one or more of the molecules involved in the ras cell signaling cascade.
Moreover, the p62 polypeptides of the invention can also be targets of the kinases of this
cascade, e.g., the p62 polypeptides can be phosphorylated, e.g., at one or more of the
phosphorylation sites described herein, by a kinase, e.g., MAP kinase, involved in the
ras cascade.

GTPases have been found to control processes as diverse as growth control,
apoptosis, translation, vesicular transport, cytoskeletal organization, and nuclear
transport (Chant, J. and Stowers, L. (1995) Cell 81:1-4). Examples of other known
GTPases include rac, rho, and cdc42. p62 binding to a GTPase demonstrates that p62
also controls a number of cellular events, including focal adhesion and stress fiber
formation, that are all important in cell growth and cell cycle progression.

Polypeptides having a p62 activity can have any one or more of the activities
described herein. An example of a preferred polypeptide having a p62 activity is a
polypeptide which is capable of binding to an SH2 domain and to ubiquitin.

Various aspects of the invention are described in further detail in the following 
subsections: 


I. Isolated Nucleic acid Molecules 

One aspect of this invention pertains to isolated nucleic acid molecules that
encode a novel p62 polypeptide, such as human p62, portions or fragments of such
nucleic acids, or equivalents thereof. The term "nucleic acid molecule" as used herein is
intended to include such fragments or equivalents and refers to DNA molecules (e.g.,
cDNA or genomic DNA) and RNA molecules (e.g., mRNA). The nucleic acid molecule
can be single-stranded or double-stranded, but preferably is double-stranded DNA. An
"isolated" nucleic acid molecule is free of sequences which naturally flank the nucleic
acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the genomic
DNA of the organism from which the nucleic acid is derived. Moreover, an "isolated"
nucleic acid molecule, such as a cDNA molecule, can be free of other cellular material.

The term "equivalent" is intended to include nucleotide sequences encoding a
functionally equivalent p62 polypeptide or functionally equivalent polypeptide or
peptides having a p62 activity. Functionally equivalent p62 polypeptide or peptides
include polypeptides which have one or more of the functional characteristics described
herein. Other equivalents of p62 polypeptides include structural equivalents. Structural
equivalents of a p62 polypeptide preferably comprise an SH2 binding domain and a
ubiquitin binding domain. Preferably the SH2 binding domain binds to the SH2 domain
of Lck as set forth herein. Other preferred structural equivalents of p62 polypeptides
include an SH2 binding domain, a ubiquitin binding domain, and optionally one or more
of the domains present in p62 polypeptides described herein. Preferred nucleic acids of
the invention include nucleic acid molecules comprising a nucleotide sequence provided
in Figure 1 (SEQ ID NO:1) or Figure 3 (SEQ ID NO:3) or fragments, portions or
equivalents thereof.

In one embodiment, the invention pertains to a nucleic acid molecule which is a
naturally occurring form of a nucleic acid molecule encoding a p62 polypeptide, such as
a p62 polypeptide having an amino acid sequence shown in Figure 2 (SEQ ID NO:2) or
Figure 4 (SEQ ID NO:4). A naturally occurring form of a nucleic acid encoding p62 is
derived from hematopoietic cells. Such naturally occurring equivalents can be obtained,
for example, by screening a cDNA library, prepared with RNA from hematopoietic
cells, with a nucleic acid molecule having a sequence shown in Figure 1 (SEQ ID NO:1)
or Figure 3 (SEQ ID NO:3) under high stringency hybridization conditions. Such
conditions are further described herein.

Also within the scope of the invention are nucleic acids encoding natural variants and
isoforms of p62 polypeptides, such as splice forms.

In a preferred embodiment, the nucleic acid molecule encoding a p62
polypeptide is a cDNA. Preferably, the nucleic acid molecule is a cDNA molecule
consisting of at least a portion of a nucleotide sequence encoding human p62, as shown
in Figure 1 (SEQ ID NO:1) or as shown in Figure 3 (SEQ ID NO:3). A preferred
portion of the cDNA molecule of Figure 1 (SEQ ID NO:1) or Figure 3 (SEQ ID NO:3)
includes the coding region of the molecule. Other preferred portions include those
which code for domains of p62, such as the SH2 binding domain, the GTPase binding
domain, the zinc finger domain, the domain containing at least one of the
above-described phosphorylation sites, and the ubiquitin binding domain, or any
combination thereof.
Additional regions of the nucleic acid molecules of the invention encode polypeptides
which comprise an SH3 binding domain and a PEST domain. In another
embodiment, the nucleic acid of the invention encodes a p62 polypeptide or an active
portion or fragment thereof having an amino acid sequence shown in Figure 2 (SEQ ID
NO:2) or in Figure 4 (SEQ ID NO:4). In yet another embodiment, preferred nucleic acid
molecules encode a polypeptide having an overall amino acid sequence identity of at
least about 50%, more preferably at least about 60%, more preferably at least about
70%, more preferably at least about 80%, and most preferably at least about 90% or
more with an amino acid sequence shown in Figure 2 (SEQ ID NO:2) or Figure 4 (SEQ
ID NO:4). Nucleic acid molecules which encode peptides having an overall amino acid
sequence identity of at least about 93%, more preferably at least about 95%, and most
preferably at least about 98-99% with a sequence set forth in Figure 2 (SEQ ID NO:2) or
Figure 4 (SEQ ID NO:4) are also within the scope of the invention. Homology, also
termed herein "identity," refers to sequence similarity between two proteins (peptides) or
between two nucleic acid molecules. Homology can be determined by comparing a
position in each sequence which may be aligned for purposes of comparison. When a
position in the compared sequences is occupied by the same nucleotide base or amino
acid, then the molecules are homologous, or identical, at that position. A degree (or
percentage) of homology between sequences is a function of the number of matching or
homologous positions shared by the sequences.
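The position-by-position comparison described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, that "-" marks a gap, and that gap columns are excluded from the count; the function name, the gap convention and the choice of denominator are illustrative assumptions, not definitions taken from this specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of compared positions occupied by the same residue.

    Assumption: inputs are pre-aligned strings of equal length and
    columns containing a gap ("-") in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # gap column: not counted (illustrative convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical 8-residue peptides differing at one position.
identity = percent_identity("MKTAYIAK", "MKTAHIAK")
```

Other conventions (for example, counting gap columns in the denominator) give different percentages, so the convention used should be stated whenever an identity threshold such as "at least about 90%" is applied.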

Isolated nucleic acids encoding a peptide having a p62 activity, as described
herein, and having a sequence which differs from the nucleotide sequence shown in Figure 1
(SEQ ID NO:1) or Figure 3 (SEQ ID NO:3) due to degeneracy in the genetic code are
also within the scope of the invention. Such nucleic acids encode functionally
equivalent peptides (e.g., having a p62 activity) or structurally equivalent polypeptides
but differ in sequence from the sequence of Figure 2 (SEQ ID NO:2) or Figure 4 (SEQ
ID NO:4) due to degeneracy in the genetic code. For example, a number of amino acids
are designated by more than one triplet. Codons that specify the same amino acid, or
synonyms (for example, CAU and CAC are synonyms for histidine) may occur due to
degeneracy in the genetic code. As one example, DNA sequence polymorphisms within
the nucleotide sequence of a p62 polypeptide (especially those within the third base of a
codon) may result in "silent" mutations in the DNA which do not affect the amino acid
encoded. However, it is expected that DNA sequence polymorphisms that do lead to 
changes in the amino acid sequences of the p62 polypeptide will exist within a 
population. It will be appreciated by one skilled in the art that these variations in one or 
more nucleotides (up to about 3-4% of the nucleotides) of the nucleic acids encoding
peptides having the activity of a p62 polypeptide may exist among individuals within a
population due to natural allelic variation. Any and all such nucleotide variations and 
resulting amino acid polymorphisms are within the scope of the invention. Furthermore, 
there are likely to be isoforms or family members of the p62 polypeptide family in 
addition to those described herein. Such isoforms or family members are defined as
proteins related in function and amino acid sequence to a p62 polypeptide, but encoded
by genes at different loci. Such isoforms or family members are within the scope of the
invention. Additional members of the p62 polypeptide family can be isolated by, for
example, screening a library of interest under low stringency conditions described herein
or by screening or amplifying with degenerate probes derived from highly conserved
amino acid sequences, for example, from the amino acid sequences in Figure 2, SEQ ID
NO:2 or in Figure 4, SEQ ID NO:4. Alternatively, other members of the p62
polypeptide family, as well as the remaining N-terminal portion of the second p62
polypeptide described herein, can be isolated using one or more of the following
techniques. For example, the Daudi cell library which was initially screened to obtain
the second p62 cDNA (i.e., by analyzing three positive clones from a pool of 0.5 x 10^5
individual colonies) can be further screened by analyzing 5 x 10^5 individual colonies.
This library can be screened using a 150 base pair probe obtained from the 5' end of the
cDNA shown in Figure 3, SEQ ID NO:3. Alternatively, using a protocol known as
RACE ("Rapid Amplification of cDNA Ends," described in Frohman, M.A., PCR
Protocols (Academic Press, Inc. 1990) pp. 28-38), the missing 5' end of the nucleotide
sequence encoding the second p62 polypeptide can be obtained. The RACE protocol
begins with a purification of 1 µg of polyA RNA from cultured Daudi cells. The polyA
RNA is then used as a template for the RACE reaction. A gene specific primer encoding
a 17-mer minus strand complementary to nucleotides 11 to 27 of SEQ ID NO:3
(AGCGGCGGAATTCCACC (SEQ ID NO:22)) is then used to extend the 5' end of the
cDNA by AMV reverse transcriptase. A homopolymer (oligo dC) is then appended by
using terminal transferase to tail the first-strand reaction product. Finally, amplification
by PCR is accomplished using a gene specific primer synthesized as described above
and a hybrid primer containing oligo dG. The amplified gene product can then be
sequenced. Other techniques for isolating additional members of the p62 polypeptide
family as well as the N-terminal portion of the second p62 polypeptide include screening
a genomic B cell library to obtain genes of the p62 family. Positive clones are then
analyzed and sequenced to obtain additional family members.
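The relationship between the minus-strand gene specific primer and the sense-strand region it anneals to can be sketched as a reverse-complement check. Only the 17-mer primer sequence is taken from the text; the helper names and the mock sense strand (ten placeholder "N" bases followed by the deduced target) are hypothetical, because SEQ ID NO:3 itself is not reproduced in this passage.

```python
# Watson-Crick pairing for DNA; characters outside ACGT are left unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequence as given in the text (SEQ ID NO:22).
GENE_SPECIFIC_PRIMER = "AGCGGCGGAATTCCACC"

# Sense-strand sequence that a perfectly complementary primer would anneal to.
sense_target = reverse_complement(GENE_SPECIFIC_PRIMER)


def anneals_at(sense_strand: str, primer: str, first_nt: int, last_nt: int) -> bool:
    """Check that `primer` is the exact reverse complement of the sense
    strand between positions first_nt and last_nt (1-based, inclusive)."""
    return sense_strand[first_nt - 1:last_nt].upper() == reverse_complement(primer)


# Hypothetical stand-in for the 5' end of SEQ ID NO:3: unknown bases at
# positions 1 to 10, then the deduced target at positions 11 to 27.
mock_sense_strand = "N" * 10 + sense_target + "ACGT"
primer_fits = anneals_at(mock_sense_strand, GENE_SPECIFIC_PRIMER, 11, 27)
```

A check of this kind only confirms that the coordinates and the primer are mutually consistent; it says nothing about priming efficiency in the reverse transcription step.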

A "fragment" or "portion" of a nucleic acid encoding a p62 polypeptide is
defined as a nucleotide sequence having fewer nucleotides than the nucleotide sequence
encoding the entire amino acid sequence of a p62 polypeptide, such as human p62. A
fragment or portion of a nucleic acid molecule is at least about 20 nucleotides,
preferably at least about 30 nucleotides, more preferably at least about 40 nucleotides,
even more preferably at least about 50 nucleotides in length. Also within the scope of
the invention are nucleic acid fragments which are at least about 60, 70, 80, 90, 100 or
more nucleotides in length. Preferred fragments or portions include fragments which
encode a polypeptide having a p62 activity as described herein. To identify fragments or
portions of the nucleic acids encoding fragments or portions of polypeptides which have
a p62 activity, several different assays can be employed. For example, to determine the
binding characteristics of p62 peptides, commonly practiced binding studies, for
example, those described in the Examples section herein, can be performed to obtain p62
peptides which bind to, for example, an SH2 domain, ubiquitin, or GTPase.

For determining whether a p62 polypeptide or portion or fragment thereof, such
as a fragment of human p62, is capable of modulating T cell activity, such as T cell
proliferation or lymphokine secretion, e.g., IL-2 secretion, the polypeptide is added to a
culture of T cells, such as CD4+ T cells, and incubated in the presence of a primary
activation signal, such as an anti-CD3 antibody, and various amounts of a p62 portion or
fragment. Following incubation for about 3 days, a proliferation assay is performed,
which is indicative of the proliferation rate of the T cells. Thus, a fragment of a p62
antigen which is capable of costimulating T cells is a fragment of a p62 antigen which in
the presence of a primary T cell activation signal stimulates the T cells to proliferate at a
rate that is greater than the proliferation rate of T cells contacted only with a primary
activation signal. Proliferation assays can also be performed as described in PCT
Application No. PCT/US94/08423. Lymphokine secretion, e.g., secretion of the
lymphokines IL-2, tumor necrosis factor (TNF), granulocyte-macrophage colony
stimulating factor (GM-CSF), and gamma interferon, can be measured using standard
assays. Alternatively, T cells transfected with a cDNA encoding a p62 polypeptide or
fragment or portion thereof which has a p62 activity can be used to screen for agents
which inhibit p62. In such cells, the level of IL-2 gene activation and/or level of
stimulation could be measured to indicate inhibition or activation of p62.

Another aspect of the invention provides a nucleic acid which hybridizes under
high or low stringency conditions to a nucleic acid which encodes a peptide having all or
a portion of an amino acid sequence shown in Figure 2 (SEQ ID NO:2) or Figure 4
(SEQ ID NO:4). Appropriate stringency conditions which promote DNA hybridization,
for example, 6.0 X sodium chloride/sodium citrate (SSC) at about 45°C, followed by a
wash of 2.0 X SSC at 50°C, are known to those skilled in the art or can be found in
Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.
For example, the salt concentration in the wash step can be selected from a low
stringency of about 2.0 X SSC at 25°C to a high stringency of about 0.2 X SSC at 65°C.
In addition, the temperature in the wash step can be increased from low stringency
conditions at room temperature, about 22°C, to high stringency conditions, at about
65°C. Preferably, an isolated nucleic acid molecule of the invention that hybridizes
under stringent conditions to the sequence of Figure 1, SEQ ID NO:1 or Figure 3, SEQ
ID NO:3 corresponds to a naturally-occurring nucleic acid molecule. As used herein, a
"naturally-occurring" nucleic acid molecule refers to an RNA or DNA molecule having
a nucleotide sequence that occurs in nature (e.g., encodes a natural protein). In one
embodiment, the nucleic acid encodes a natural p62 polypeptide.
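The wash conditions quoted above can be captured in a small sketch that orders conditions by stringency. It relies only on the qualitative rule stated in the text (lower salt and higher temperature give higher stringency); the class, constant and function names are illustrative assumptions, and no melting-temperature model is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wash:
    """A wash step described by SSC strength (in X) and temperature (deg C)."""
    ssc: float
    temp_c: float


# Values quoted in the text; the constant names are illustrative only.
LOW_STRINGENCY_WASH = Wash(ssc=2.0, temp_c=25.0)
HIGH_STRINGENCY_WASH = Wash(ssc=0.2, temp_c=65.0)
EXAMPLE_WASH = Wash(ssc=2.0, temp_c=50.0)  # wash following 6.0 X SSC at about 45 C


def at_least_as_stringent(a: Wash, b: Wash) -> bool:
    """True when `a` uses no more salt and no lower temperature than `b`.

    This is a partial order: a pair such as (0.2 X, 25 C) versus
    (2.0 X, 65 C) is not comparable and returns False in both directions.
    """
    return a.ssc <= b.ssc and a.temp_c >= b.temp_c


high_beats_low = at_least_as_stringent(HIGH_STRINGENCY_WASH, LOW_STRINGENCY_WASH)
```

Because the comparison is deliberately conservative, conditions that trade salt against temperature are left unranked rather than guessed at.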
In addition to naturally-occurring allelic variants of the p62 sequence that may
exist in the population, the skilled artisan will further appreciate that changes may be
introduced by mutation into the nucleotide sequence of Figure 1, SEQ ID NO:1 or
Figure 3, SEQ ID NO:3, thereby leading to changes in the amino acid sequence of the
encoded p62 polypeptide, without altering the functional ability of the p62 polypeptide.
For example, nucleotide substitutions leading to amino acid substitutions at
"non-essential" amino acid residues may be made in the sequence of Figure 1, SEQ ID NO:1
or Figure 3, SEQ ID NO:3. A "non-essential" amino acid residue is a residue that can be
altered from the wild-type sequence of p62 (e.g., the sequence of Figure 2, SEQ ID
NO:2 or Figure 4, SEQ ID NO:4) without altering the p62 activity of the polypeptide.

An isolated nucleic acid molecule encoding a p62 polypeptide homologous to the
protein of Figure 2, SEQ ID NO:2 or Figure 4, SEQ ID NO:4 can be created by
introducing one or more nucleotide substitutions, additions or deletions into the
nucleotide sequence of Figure 1, SEQ ID NO:1 or Figure 3, SEQ ID NO:3 such that one
or more amino acid substitutions, additions or deletions are introduced into the encoded
polypeptide. Mutations can be introduced into Figure 1, SEQ ID NO:1 or Figure 3, SEQ
ID NO:3 by standard techniques, such as site-directed mutagenesis and PCR-mediated
mutagenesis. Preferably, conservative amino acid substitutions are made at one or more
predicted non-essential amino acid residues. A "conservative amino acid substitution" is
one in which the amino acid residue is replaced with an amino acid residue having a
similar side chain. Families of amino acid residues having similar side chains have been
defined in the art, including basic side chains (e.g., lysine, arginine, histidine), acidic
side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine,
asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g.,
alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan),
beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains
(e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential
amino acid residue in p62 is preferably replaced with another amino acid residue from
the same side chain family. Alternatively, in another embodiment, mutations can be
introduced randomly along all or part of a p62 coding sequence, such as by saturation
mutagenesis, and the resultant mutants can be screened for p62 activity to
identify mutants that retain p62 activity. Following mutagenesis of the
nucleotide sequence of Figure 1, SEQ ID NO:1 or Figure 3, SEQ ID NO:3, the encoded
polypeptide can be expressed recombinantly and activity of the protein can be
determined.
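The side chain families listed above lend themselves to a simple lookup. The sketch below encodes those families with standard one-letter amino acid codes and reports whether a proposed substitution stays within at least one shared family; the dictionary and function names are illustrative assumptions, and membership in a common family does not by itself show that a given residue is non-essential.

```python
# Side chain families as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),            # lysine, arginine, histidine
    "acidic": set("DE"),            # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original: str, replacement: str) -> list:
    """Names of the families that contain both residues."""
    original, replacement = original.upper(), replacement.upper()
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    ]


def is_conservative(original: str, replacement: str) -> bool:
    """True when the two residues differ and share at least one family."""
    return original.upper() != replacement.upper() and bool(
        shared_families(original, replacement)
    )


lys_to_arg = is_conservative("K", "R")   # both basic
lys_to_asp = is_conservative("K", "D")   # basic versus acidic
```

Because some residues appear in more than one family (for example, histidine is listed as both basic and aromatic), the lookup returns every family a pair has in common rather than a single label.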

In addition to the nucleic acid molecules encoding p62 polypeptides described
above, another aspect of the invention pertains to isolated nucleic acid molecules which
are antisense thereto. An "antisense" nucleic acid comprises a nucleotide sequence
which is complementary to a "sense" nucleic acid encoding a protein, e.g.,
complementary to the coding strand of a double-stranded cDNA molecule or
complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can
hydrogen bond to a sense nucleic acid.

The antisense nucleic acid can be complementary to an entire p62 coding strand, or to
only a portion thereof. In one embodiment, an antisense nucleic acid molecule is
antisense to a "coding region" of the coding strand of a nucleotide sequence encoding
p62. The term "coding region" refers to the region of the nucleotide sequence
comprising codons which are translated into amino acid residues (e.g., the entire coding
region of Figure 1, SEQ ID NO:1 or Figure 3, SEQ ID NO:3). In another embodiment,
the antisense nucleic acid molecule is antisense to a "noncoding region" of the coding
strand of a nucleotide sequence encoding p62. The term "noncoding region" refers to 5'
and 3' sequences which flank the coding region that are not translated into amino acids
(i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding p62 polypeptides disclosed herein
(e.g., Figure 1, SEQ ID NO:1 and Figure 3, SEQ ID NO:3), antisense nucleic acids of
the invention can be designed according to the rules of Watson and Crick base pairing.
The antisense nucleic acid molecule can be complementary to the entire coding region of
p62 mRNA, but more preferably is an oligonucleotide which is antisense to only a
portion of the coding or noncoding region of p62 mRNA. For example, the antisense
oligonucleotide can be complementary to the region surrounding the translation start site
of p62 mRNA. An antisense oligonucleotide can be, for example, about 15, 20, 25, 30,
35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be
constructed using chemical synthesis and enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or
variously modified nucleotides designed to increase the biological stability of the
molecules or to increase the physical stability of the duplex formed between the
antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine
substituted nucleotides can be used. Alternatively, the antisense nucleic acid can be
produced biologically using an expression vector into which a nucleic acid has been
subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic
acid will be of an antisense orientation to a target nucleic acid of interest, described
further in the following subsection).
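Designing an antisense oligonucleotide by Watson and Crick base pairing, as described above, amounts to taking the reverse complement of the chosen target window. The sketch below picks a window spanning a translation start codon; the 30-nucleotide coding-strand fragment is invented for illustration and is not taken from SEQ ID NO:1 or SEQ ID NO:3, and the function names and the default window geometry are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start_codon_index: int,
                    length: int = 20, upstream: int = 8) -> str:
    """Antisense oligonucleotide (5' to 3') against a window that begins
    `upstream` bases before the start codon (0-based index) and spans
    `length` bases of the coding strand."""
    begin = max(0, start_codon_index - upstream)
    target = coding_strand[begin:begin + length]
    return reverse_complement(target)


# Hypothetical coding-strand fragment containing an ATG start codon.
HYPOTHETICAL_CODING_STRAND = "GCCGCCACCATGGCGTCGCTCACCGTGAAG"
start = HYPOTHETICAL_CODING_STRAND.index("ATG")
oligo = antisense_oligo(HYPOTHETICAL_CODING_STRAND, start)
```

The length default of 20 falls within the range of about 15 to 50 nucleotides mentioned in the text; the chemical modifications discussed there (for example, phosphorothioate linkages) are outside the scope of this sequence-level sketch.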
In another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity which are capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. A ribozyme having specificity for a p62-encoding nucleic acid
can be designed based upon the nucleotide sequence of a p62 cDNA disclosed herein
(i.e., Figure 1, SEQ ID NO:1 or Figure 3, SEQ ID NO:3). See, e.g., Cech et al. U.S.
Patent No. 4,987,071; and Cech et al. U.S. Patent No. 5,116,742. Alternatively, p62
mRNA can be used to select a catalytic RNA having a specific ribonuclease activity
from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J.W. (1993) Science
261:1411-1418.

The nucleic acid sequences of the invention can also be chemically synthesized
using standard techniques. Various methods of chemically synthesizing
polydeoxynucleotides are known, including solid-phase synthesis which, like peptide
synthesis, has been fully automated in commercially available DNA synthesizers (See
e.g., Itakura et al. U.S. Patent No. 4,598,049; Caruthers et al. U.S. Patent No. 4,458,066;
and Itakura U.S. Patent Nos. 4,401,796 and 4,373,071, incorporated by reference
herein).

II. Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression
vectors, containing a nucleic acid encoding p62 (or a portion or fragment thereof). As
used herein, the term "vector" refers to a nucleic acid molecule capable of
transporting another nucleic acid to which it has been linked. One type of vector is a
"plasmid", which refers to a circular double stranded DNA loop into which additional
DNA segments may be ligated. Another type of vector is a viral vector, wherein
additional DNA segments may be ligated into the viral genome. Certain vectors are
capable of autonomous replication in a host cell into which they are introduced (e.g.,
bacterial vectors having a bacterial origin of replication and episomal mammalian
vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into
the genome of a host cell upon introduction into the host cell, and thereby are
replicated along with the host genome. Moreover, certain vectors are capable of
directing the expression of genes to which they are operatively linked. Such vectors
are referred to herein as "expression vectors". In general, expression vectors of utility
in recombinant DNA techniques are often in the form of plasmids. In the present
specification, "plasmid" and "vector" may be used interchangeably as the plasmid is
the most commonly used form of vector. However, the invention is intended to
include such other forms of expression vectors, such as viral vectors (e.g., replication
defective retroviruses, adenoviruses and adeno-associated viruses), which serve
equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of
the invention in a form suitable for expression of the nucleic acid in a host cell, which
means that the recombinant expression vectors include one or more regulatory
sequences, selected on the basis of the host cells to be used for expression, which are
operatively linked to the nucleic acid sequence to be expressed. Within a recombinant
expression vector, "operably linked" is intended to mean that the nucleotide sequence of
interest is linked to the regulatory sequence(s) in a manner which allows for expression
of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a
host cell when the vector is introduced into the host cell). The term "regulatory
sequence" is intended to include promoters, enhancers and other expression control
elements (e.g., polyadenylation signals). Such regulatory sequences are described, for
example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, CA (1990). Regulatory sequences include those which
direct constitutive expression of a nucleotide sequence in many types of host cell and
those which direct expression of the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by those skilled in the art
that the design of the expression vector may depend on such factors as the choice of the
host cell to be transformed, the level of expression of protein desired, etc. The
expression vectors of the invention can be introduced into host cells to thereby produce
proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as
described herein (e.g., p62 polypeptides, mutant forms of p62, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for
expression of p62 in prokaryotic or eukaryotic cells. For example, p62 can be expressed
in bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), yeast
cells or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene
Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA
(1990). Alternatively, the recombinant expression vector may be transcribed and
translated in vitro, for example using T7 promoter regulatory sequences and T7
polymerase.




Expression of proteins in prokaryotes is most often carried out in E. coli with
vectors containing constitutive or inducible promoters directing the expression of either
fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein
encoded therein, usually to the amino terminus of the recombinant protein. Such fusion
vectors typically serve three purposes: 1) to increase expression of recombinant protein;
2) to increase the solubility of the recombinant protein; and 3) to aid in the purification
of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion
expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion
moiety and the recombinant protein to enable separation of the recombinant protein from
the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and
their cognate recognition sequences, include Factor Xa, thrombin and enterokinase.
Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D.B.
and Johnson, K.S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, MA)
and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST),
maltose E binding protein, or protein A, respectively, to the target recombinant protein.

Examples of suitable inducible non-fusion E. coli expression vectors include
pTrc (Amann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene
Expression Technology: Methods in Enzymology 185, Academic Press, San Diego,
California (1990) 60-89). Target gene expression from the pTrc vector relies on host
RNA polymerase transcription from a hybrid trp-lac fusion promoter. Target gene
expression from the pET 11d vector relies on transcription from a T7 gn10-lac fusion
promoter mediated by a coexpressed viral RNA polymerase (T7 gn1). This viral
polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident λ
prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV5
promoter.

One strategy to maximize recombinant protein expression in E. coli is to express
the protein in a host bacteria with an impaired capacity to proteolytically cleave the
recombinant protein (Gottesman, S., Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, California (1990) 119-128). Another
strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an
expression vector so that the individual codons for each amino acid are those
preferentially utilized in E. coli (Wada et al., (1992) Nuc. Acids Res. 20:2111-2118).
Such alteration of nucleic acid sequences of the invention can be carried out by standard
DNA synthesis techniques.
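Re-coding a sequence with host-preferred codons is possible because the genetic code is degenerate: synonymous codons can be swapped without changing the encoded polypeptide. The sketch below builds the standard genetic code, lists synonymous codons, and substitutes codons from a preference table. The preference table shown is a hypothetical placeholder, not measured E. coli codon usage data, and all names are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(dna: str) -> list:
    """Split an in-frame coding sequence into codons."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(dna))


def synonymous_codons(amino_acid: str) -> list:
    """All codons that specify the given amino acid (one-letter code)."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Hypothetical preference table for illustration only.
HYPOTHETICAL_PREFERRED = {"L": "CTG", "H": "CAC", "R": "CGT"}


def recode(dna: str, preferred: dict) -> str:
    """Replace each codon with the preferred synonym, where one is listed."""
    recoded = "".join(
        preferred.get(CODON_TABLE[codon], codon) for codon in split_codons(dna)
    )
    # The encoded polypeptide must be unchanged by the substitution.
    if translate(recoded) != translate(dna):
        raise ValueError("preference table contains a non-synonymous codon")
    return recoded


original = "ATGCTACATAGA"  # hypothetical fragment encoding Met-Leu-His-Arg
optimized = recode(original, HYPOTHETICAL_PREFERRED)
```

For histidine, `synonymous_codons("H")` returns the DNA forms of the two synonyms mentioned earlier in the text (CAU and CAC in RNA notation). A real application would replace the placeholder table with codon usage data for the chosen host strain.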

In another embodiment, the p62 expression vector is a yeast expression vector.
Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari et
al., (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell 30:933-
943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), and pYES2 (Invitrogen
Corporation, San Diego, CA).

Alternatively, p62 can be expressed in insect cells using baculovirus expression
vectors. Baculovirus vectors available for expression of proteins in cultured insect cells
(e.g., Sf 9 cells) include the pAc series (Smith et al., (1983) Mol. Cell Biol. 3:2156-
2165) and the pVL series (Luckow, V.A., and Summers, M.D., (1989) Virology 170:31-
39).

In yet another embodiment, a nucleic acid of the invention is expressed in
mammalian cells using a mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed, B., (1987) Nature 329:840) and pMT2PC
(Kaufman et al. (1987), EMBO J. 6:187-195). When used in mammalian cells, the
expression vector's control functions are often provided by viral regulatory elements.
For example, commonly used promoters are derived from polyoma, adenovirus 2,
cytomegalovirus and Simian Virus 40. In another embodiment, the recombinant
mammalian expression vector is capable of directing expression of the nucleic acid
preferentially in a particular cell type (e.g., tissue-specific regulatory elements are used
to express the nucleic acid). Tissue-specific regulatory elements are known in the art.
Non-limiting examples of suitable tissue-specific promoters include the albumin
promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific
promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular
promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and
immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983)
Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne
and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific
promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific
promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and European
Application Publication No. 264,166). Developmentally-regulated promoters are also
encompassed, for example the murine hox promoters (Kessel and Gruss (1990) Science
249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev.
3:537-546).

In one embodiment, a recombinant expression vector containing DNA encoding
a p62 fusion protein is produced. A p62 fusion protein can be produced by recombinant
expression of a nucleotide sequence encoding a first polypeptide having a p62
activity and a nucleotide sequence encoding a second polypeptide having an amino acid
sequence unrelated to an amino acid sequence selected from the group consisting of an
amino acid sequence shown in Figure 2 (SEQ ID NO:2) and Figure 4 (SEQ ID NO:4).
In many instances, the second polypeptide corresponds to a moiety that alters a
characteristic of the first polypeptide, e.g., its solubility, affinity, stability or valency. For
example, a p62 polypeptide of the present invention can be generated as a
glutathione-S-transferase (GST) fusion protein. Such GST fusion proteins can enable easy
purification of the p62 polypeptide, such as by the use of glutathione-derivatized
matrices (see, for example, Current Protocols in Molecular Biology, eds. Ausubel et al.
(N.Y.: John Wiley & Sons, 1991)). Preferably the fusion proteins of the invention are
functional in a two hybrid assay. Fusion proteins and peptides produced by recombinant
techniques may be secreted and isolated from a mixture of cells and medium containing
the protein or peptide. Alternatively, the protein or peptide may be retained
cytoplasmically and the cells harvested, lysed and the protein isolated. A cell culture
typically includes host cells, media and other byproducts. Suitable media for cell culture
are well known in the art. Protein and peptides can be isolated from cell culture
medium, host cells, or both using techniques known in the art for purifying proteins and
peptides. Techniques for transfecting host cells and purifying proteins and peptides are
described in further detail herein.

The invention further provides a recombinant expression vector comprising a DNA
molecule of the invention cloned into the expression vector in an antisense orientation.
That is, the DNA molecule is operatively linked to a regulatory sequence in a manner
which allows for expression (by transcription of the DNA molecule) of an RNA molecule
which is antisense to p62 RNA. Regulatory sequences operatively linked to a nucleic acid
cloned in the antisense orientation can be chosen which direct the continuous expression of
the antisense RNA molecule in a variety of cell types, for instance viral promoters and/or
enhancers, or regulatory sequences can be chosen which direct constitutive, tissue specific
or cell type specific expression of antisense RNA. The antisense expression vector can be
in the form of a recombinant plasmid, phagemid or attenuated virus in which antisense
nucleic acids are produced under the control of a high efficiency regulatory region, the
activity of which can be determined by the cell type into which the vector is introduced.
For a discussion of the regulation of gene expression using antisense genes see Weintraub,
H. et al., Antisense RNA as a molecular tool for genetic analysis, Reviews - Trends in
Genetics, Vol. 1(1) 1986.

Another aspect of the invention pertains to recombinant host cells into which
a recombinant expression vector of the invention has been introduced. The terms
"host cell" and "recombinant host cell" are used interchangeably herein. It is
understood that such terms refer not only to the particular subject cell but to the
progeny or potential progeny of such a cell. Because certain modifications may occur
in succeeding generations due to either mutation or environmental influences, such
progeny may not, in fact, be identical to the parent cell, but are still included within
the scope of the term as used herein.

A host cell may be any prokaryotic or eukaryotic cell. For example, a p62
polypeptide can be expressed in bacterial cells such as E. coli, insect cells, yeast or
mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other
suitable host cells are known to those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via
conventional transformation or transfection techniques. As used herein, the terms
"transformation" and "transfection" are intended to refer to a variety of art-recognized
techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including
calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated
transfection, lipofection, or electroporation. Suitable methods for transforming or
transfecting host cells can be found in Sambrook et al. (Molecular Cloning: A
Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and
other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the 
expression vector and transfection technique used, only a small fraction of cells may
integrate the foreign DNA into their genome. In order to identify and select these
integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is
generally introduced into the host cells along with the gene of interest. Preferred
selectable markers include those which confer resistance to drugs, such as G418,
hygromycin and methotrexate. Nucleic acid encoding a selectable marker may be
introduced into a host cell on the same vector as that encoding p62 or may be introduced
on a separate vector. Cells stably transfected with the introduced nucleic acid can be 
identified by drug selection (e.g., cells that have incorporated the selectable marker gene 
will survive, while the other cells die). 

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in
culture, can be used to produce (i.e., express) p62 polypeptide. Accordingly, the
invention further provides methods for producing p62 polypeptides using the host cells
of the invention. In one embodiment, the method comprises culturing the host cell of
the invention (into which a recombinant expression vector encoding p62 has been
introduced) in a suitable medium until p62 is produced. In another embodiment, the
method further comprises isolating p62 from the medium or the host cell.

The host cells of the invention can also be used to produce non-human transgenic
animals. For example, in one embodiment, a host cell of the invention is a fertilized
oocyte or an embryonic stem cell into which p62-coding sequences have been
introduced. Such host cells can then be used to create non-human transgenic animals in
which exogenous p62 sequences have been introduced into their genome or homologous
recombinant animals in which endogenous p62 sequences have been altered. Such
animals are useful for studying the function and/or activity of p62 and for identifying
and/or evaluating modulators of p62 activity. As used herein, a "transgenic animal" is a
non-human animal, preferably a mammal, more preferably a mouse, in which one or
more of the cells of the animal includes a transgene. A transgene is exogenous DNA
which is integrated into the genome of a cell from which a transgenic animal develops
and which remains in the genome of the mature animal, thereby directing the expression
of an encoded gene product in one or more cell types or tissues of the transgenic animal.
As used herein, a "homologous recombinant animal" is a non-human animal, preferably
a mammal, more preferably a mouse, in which an endogenous p62 gene has been altered
by homologous recombination between the endogenous gene and an exogenous DNA
molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior
to development of the animal.

A transgenic animal of the invention can be created by introducing p62-encoding
nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection, and
allowing the oocyte to develop in a pseudopregnant female foster animal. The human
p62 cDNA sequence of Figure 1, SEQ ID NO:1 or Figure 3, SEQ ID NO:3 can be
introduced as a transgene into the genome of a non-human animal. Alternatively, a
non-human homologue of the human p62 gene, such as a mouse p62 gene, can be
isolated based on hybridization to the human p62 cDNA (described further in subsection
I above) and used as a transgene. Intronic sequences and polyadenylation signals can
also be included in the transgene to increase the efficiency of expression of the
transgene. A tissue-specific regulatory sequence(s) can be operably linked to the p62
transgene to direct expression of a p62 polypeptide to particular cells. Methods for
generating transgenic animals via embryo manipulation and microinjection, particularly
animals such as mice, have become conventional in the art and are described, for
example, in U.S. Patent Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Patent
No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo,
(Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar
methods are used for production of other transgenic animals. A transgenic founder
animal can be identified based upon the presence of the p62 transgene in its genome
and/or expression of p62 mRNA in tissues or cells of the animals. A transgenic founder
animal can then be used to breed additional animals carrying the transgene. Moreover,
transgenic animals carrying a transgene encoding p62 can further be bred to other
transgenic animals carrying other transgenes.

To create a homologous recombinant animal, a vector is prepared which contains
at least a portion of a p62 gene into which a deletion, addition or substitution has been
introduced to thereby alter, e.g., functionally disrupt, the p62 gene. The p62 gene can be
a human gene (e.g., from a human genomic clone isolated from a human genomic
library screened with the cDNA of Figure 1, SEQ ID NO:1 or Figure 3, SEQ ID NO:3),
but more preferably, is a non-human homologue of a human p62 gene. For example, a
mouse p62 gene can be isolated from a mouse genomic DNA library using the human
p62 cDNA of Figure 1, SEQ ID NO:1 or Figure 3, SEQ ID NO:3 as a probe. The mouse
p62 gene then can be used to construct a homologous recombination vector suitable for
altering an endogenous p62 gene in the mouse genome. In a preferred embodiment, the
vector is designed such that, upon homologous recombination, the endogenous p62 gene
is functionally disrupted (i.e., no longer encodes a functional protein; also referred to as
a "knock out" vector). Alternatively, the vector can be designed such that, upon
homologous recombination, the endogenous p62 gene is mutated or otherwise altered
but still encodes functional protein (e.g., the upstream regulatory region can be altered to
thereby alter the expression of the endogenous p62 polypeptide). In the homologous
recombination vector, the altered portion of the p62 gene is flanked at its 5' and 3' ends
by additional nucleic acid of the p62 gene to allow for homologous recombination to
occur between the exogenous p62 gene carried by the vector and an endogenous p62
gene in an embryonic stem cell. The additional flanking p62 nucleic acid is of sufficient
length for successful homologous recombination with the endogenous gene. Typically,
several kilobases of flanking DNA (both at the 5' and 3' ends) are included in the vector
(see e.g., Thomas, K.R. and Capecchi, M. R. (1987) Cell 51:503 for a description of
homologous recombination vectors). The vector is introduced into an embryonic stem
cell line (e.g., by electroporation) and cells in which the introduced p62 gene has
homologously recombined with the endogenous p62 gene are selected (see e.g., Li, E. et
al. (1992) Cell 69:915). The selected cells are then injected into a blastocyst of an
animal (e.g., a mouse) to form aggregation chimeras (see e.g., Bradley, A. in
Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, E.J. Robertson,
ed. (IRL, Oxford, 1987) pp. 113-152). A chimeric embryo can then be implanted into a
suitable pseudopregnant female foster animal and the embryo brought to term. Progeny
harboring the homologously recombined DNA in their germ cells can be used to breed
animals in which all cells of the animal contain the homologously recombined DNA by
germline transmission of the transgene. Methods for constructing homologous
recombination vectors and homologous recombinant animals are described further in
Bradley, A. (1991) Current Opinion in Biotechnology 2:823-829 and in PCT
International Publication Nos.: WO 90/11354 by Le Mouellec et al.; WO 91/01140 by
Smithies et al.; WO 92/0968 by Zijlstra et al.; and WO 93/04169 by Berns et al.

III. Isolated p62 Proteins and Anti-p62 Antibodies 

Another aspect of the invention pertains to isolated p62 polypeptides and active
fragments or portions thereof, i.e., peptides having a p62 activity, such as human p62.
This invention also provides a preparation of p62 or fragment or portion thereof. An
"isolated" protein is substantially free of cellular material or culture medium when
produced by recombinant DNA techniques, or chemical precursors or other chemicals
when chemically synthesized. In a preferred embodiment, the p62 polypeptide has an
amino acid sequence shown in Figure 2, SEQ ID NO:2 or Figure 4, SEQ ID NO:4. In
other embodiments, the p62 polypeptide is substantially homologous or similar to Figure
2, SEQ ID NO:2 or Figure 4, SEQ ID NO:4 and retains the functional activity of the
polypeptide of Figure 2, SEQ ID NO:2 or Figure 4, SEQ ID NO:4 yet differs in amino
acid sequence due to natural allelic variation or mutagenesis, as described in detail in
subsection I above. Accordingly, in another embodiment, the p62 polypeptide is a
polypeptide which comprises an amino acid sequence at least about 70% overall amino
acid identity with the amino acid sequence of Figure 2, SEQ ID NO:2 or Figure 4, SEQ
ID NO:4. Preferably, the polypeptide is at least about 80%, more preferably at least
about 90%, yet more preferably at least about 95%, and most preferably at least about
98-99% identical to Figure 2, SEQ ID NO:2 or Figure 4, SEQ ID NO:4.
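
As a rough illustration of how an overall amino acid identity figure such as the 70% threshold above might be computed, the following Python sketch scores two sequences that have already been aligned. The text does not specify an alignment method or how gaps are counted, so the choice here (identical residues divided by the number of aligned columns, gaps included) is an assumption, and the short sequences are invented for the example rather than taken from SEQ ID NO:2 or SEQ ID NO:4.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no residues")
    # A column counts as identical only when both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments: 5 identical columns out of 7.
identity = percent_identity("MKT-LLA", "MKTALLV")
print(f"{identity:.1f}% identical; meets 70% threshold: {identity >= 70.0}")
```

Other conventions, such as dividing by the length of the shorter sequence or excluding gapped columns, would give different percentages for the same pair, so the convention should be stated whenever a threshold is applied.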

An isolated p62 polypeptide can comprise the entire amino acid sequence of
Figure 2, SEQ ID NO:2 or Figure 4, SEQ ID NO:4 or a biologically active portion or
fragment thereof. For example, an active portion of p62 can comprise a selected domain
of p62, such as the SH2 binding domain or the ubiquitin binding domain. Moreover,
other biologically active portions, in which other regions of the protein are deleted, can
be prepared by recombinant techniques and evaluated for a p62 activity as described in
detail above. For example, a peptide having a p62 activity can differ in amino acid
sequence from the human p62 depicted in Figure 2, SEQ ID NO:2 or Figure 4, SEQ ID
NO:4, but such differences result in a peptide which functions in the same or similar
manner as p62. Thus, peptides having the ability to modulate T cell activity, such as by
inducing IL-2 production or T cell proliferation, or having the ability to inhibit
ubiquitin-mediated degradation of cell cycle regulatory proteins, and which preferably
have an SH2 binding domain and a ubiquitin binding domain, are within the scope of the
invention. Preferred peptides of the
invention include those which are further capable of modulating B cell activity such as
by inducing B cell differentiation or stimulating B cell survival.

A peptide can be produced by modification of the amino acid sequence of the
human p62 polypeptide shown in Figure 2, SEQ ID NO:2 or Figure 4, SEQ ID NO:4,
such as a substitution, addition or deletion of an amino acid residue which is not directly
involved in the function of p62. For example, in order to enhance stability and/or
reactivity, the polypeptides or peptides of the invention can also be modified to
incorporate one or more polymorphisms in the amino acid sequence of the protein
allergen resulting from natural allelic variation. Additionally, D-amino acids, non-natural
amino acids or non-amino acid analogues can be substituted or added to produce
a modified protein or peptide within the scope of this invention. Furthermore, proteins
or peptides of the present invention can be modified using the polyethylene glycol
(PEG) method of A. Sehon and co-workers (Wie et al. supra) to produce a protein or
peptide conjugated with PEG. In addition, PEG can be added during chemical synthesis
of a protein or peptide of the invention. Modifications of proteins or peptides or
portions thereof can also include reduction/alkylation (Tarr in: Methods of Protein
Microcharacterization, J.E. Silver ed. Humana Press, Clifton, NJ, pp 155-194 (1986));
acylation (Tarr, supra); chemical coupling to an appropriate carrier (Mishell and Shiigi,
eds, Selected Methods in Cellular Immunology, WH Freeman, San Francisco, CA
(1980); U.S. Patent 4,939,239); or mild formalin treatment (Marsh, International Archives
of Allergy and Applied Immunology, 41:199-215 (1971)).

To facilitate purification and potentially increase solubility of proteins or
peptides of the invention, it is possible to add reporter group(s) to the peptide backbone.
For example, poly-histidine can be added to a peptide to purify the peptide on
immobilized metal ion affinity chromatography (Hochuli, E. et al., Bio/Technology,
6:1321-1325 (1988)). In addition, specific endoprotease cleavage sites can be
introduced, if desired, between a reporter group and amino acid sequences of a peptide
to facilitate isolation of peptides free of irrelevant sequences.

Peptides of the invention are typically at least 30 amino acid residues in length,
preferably at least 40 amino acid residues in length, more preferably at least 50 amino
acid residues in length, and most preferably 60 amino acid residues in length. Peptides
having p62 activity and including at least 80 amino acid residues in length, at least 100
amino acid residues in length, at least about 200, at least about 300, at least about 400,
or at least about 500 or more amino acid residues in length are also within the scope of
the invention. Other peptides within the scope of the invention include those encoded
by the nucleic acids described herein.

Another embodiment of the invention provides a substantially pure preparation
of a peptide having a p62 activity. Such a preparation is substantially free of proteins
and peptides with which the peptide naturally occurs in a cell or with which it naturally
occurs when secreted by a cell.

The term "isolated" as used throughout this application refers to a nucleic acid,
protein or peptide having an activity of a p62 polypeptide substantially free of cellular
material or culture medium when produced by recombinant DNA techniques, or
chemical precursors or other chemicals when chemically synthesized. An isolated
nucleic acid is also free of sequences which naturally flank the nucleic acid (i.e.,
sequences located at the 5' and 3' ends of the nucleic acid) in the organism from which
the nucleic acid is derived.

The peptides and fusion proteins produced from the nucleic acid molecules of the
present invention can also be used to produce antibodies specifically reactive with p62
polypeptides. For example, by using a full-length p62 polypeptide, such as an antigen
having an amino acid sequence shown in Figure 2, SEQ ID NO:2 or Figure 4, SEQ ID
NO:4, or a peptide fragment thereof, anti-protein/anti-peptide polyclonal antisera or
monoclonal antibodies can be made using standard methods. A mammal (e.g., a mouse,
hamster, or rabbit) can be immunized with an immunogenic form of the protein or
peptide which elicits an antibody response in the mammal. The immunogen can be, for
example, a recombinant p62 polypeptide, or fragment or portion thereof or a synthetic
peptide fragment. The immunogen can be modified to increase its immunogenicity. For
example, techniques for conferring immunogenicity on a peptide include conjugation to
carriers or other techniques well known in the art. For example, the peptide can be
administered in the presence of adjuvant. The progress of immunization can be
monitored by detection of antibody titers in plasma or serum. Standard ELISA or other
immunoassay can be used with the immunogen as antigen to assess the levels of
antibodies.

Following immunization, antisera can be obtained and, if desired, polyclonal
antibodies isolated from the sera. To produce monoclonal antibodies, antibody
producing cells (lymphocytes) can be harvested from an immunized animal and fused
with myeloma cells by standard somatic cell fusion procedures thus immortalizing these
cells and yielding hybridoma cells. Such techniques are well known in the art. For
example, the hybridoma technique originally developed by Kohler and Milstein (Nature
(1975) 256:495-497) can be used, as can other techniques such as the human B-cell hybridoma
technique (Kozbor et al., Immunol. Today (1983) 4:72), the EBV-hybridoma technique
to produce human monoclonal antibodies (Cole et al. Monoclonal Antibodies in Cancer
Therapy (1985) Alan R. Liss, Inc., pages 77-96), and screening of combinatorial
antibody libraries (Huse et al., Science (1989) 246:1275). Hybridoma cells can be
screened immunochemically for production of antibodies specifically reactive with the
peptide and monoclonal antibodies isolated.

The term "antibody" as used herein is intended to include fragments thereof
which are also specifically reactive with a peptide having the activity of a novel B
lymphocyte antigen or fusion protein as described herein. Antibodies can be fragmented
using conventional techniques and the fragments screened for utility in the same manner
as described above for whole antibodies. For example, F(ab')2 fragments can be
generated by treating antibody with pepsin. The resulting F(ab')2 fragment can be
treated to reduce disulfide bridges to produce Fab' fragments. The antibody of the
present invention is further intended to include bispecific and chimeric molecules having
an anti-p62 polypeptide (i.e., p62) portion.

When antibodies produced in non-human subjects are used therapeutically in
humans, they are recognized to varying degrees as foreign and an immune response may
be generated in the patient. One approach for minimizing or eliminating this problem,
which is preferable to general immunosuppression, is to produce chimeric antibody
derivatives, i.e., antibody molecules that combine a non-human animal variable region
and a human constant region. Chimeric antibody molecules can include, for example,
the antigen binding domain from an antibody of a mouse, rat, or other species, with
human constant regions. A variety of approaches for making chimeric antibodies have
been described and can be used to make chimeric antibodies containing the
immunoglobulin variable region which recognizes the gene product of the novel p62
polypeptides of the invention. See, e.g., Morrison et al. (1985) Proc. Natl. Acad. Sci.
U.S.A. 81:6851; Takeda et al. (1985) Nature 314:452; Cabilly et al., U.S. Patent No.
4,816,567; Boss et al., U.S. Patent No. 4,816,397; Tanaguchi et al., European Patent
Publication EP171496; European Patent Publication 0173494; United Kingdom Patent
GB 2177096B. It is expected that such chimeric antibodies would be less immunogenic
in a human subject than the corresponding non-chimeric antibody.

For human therapeutic purposes, the monoclonal or chimeric antibodies
specifically reactive with a p62 polypeptide as described herein can be further
humanized by producing human variable region chimeras, in which parts of the variable
regions, especially the conserved framework regions of the antigen-binding domain, are
of human origin and only the hypervariable regions are of non-human origin. General
reviews of "humanized" chimeric antibodies are provided by Morrison, S. L. (1985)
Science 229:1202-1207 and by Oi et al. (1986) BioTechniques 4:214. Such altered
immunoglobulin molecules may be made by any of several techniques known in the art
(e.g., Teng et al. (1983) Proc. Natl. Acad. Sci. U.S.A. 80:7308-7312; Kozbor et al.
(1983) Immunology Today 4:7279; Olsson et al. (1982) Meth. Enzymol. 92:3-16), and
are preferably made according to the teachings of PCT Publication WO92/06193 or EP
0239400. Humanized antibodies can be commercially produced by, for example,
Scotgen Limited, 2 Holly Road, Twickenham, Middlesex, Great Britain. Suitable
"humanized" antibodies can be alternatively produced by CDR or CEA substitution (see
U.S. Patent 5,225,539 to Winter; Jones et al. (1986) Nature 321:552-525; Verhoeyan et
al. (1988) Science 239:1534; and Beidler et al. (1988) J. Immunol. 141:4053-4060).
Humanized antibodies which have reduced immunogenicity are preferred for
immunotherapy in human subjects. Immunotherapy with a humanized antibody will
likely reduce the necessity for any concomitant immunosuppression and may result in
increased long term effectiveness for the treatment of chronic disease situations or
situations requiring repeated antibody treatments.

As an alternative to humanizing a monoclonal antibody from a mouse or other
species, a human monoclonal antibody directed against a human protein can be
generated. Transgenic mice carrying human antibody repertoires have been created
which can be immunized with a p62 polypeptide, such as human p62. Splenocytes from
these immunized transgenic mice can then be used to create hybridomas that secrete
human monoclonal antibodies specifically reactive with a p62 polypeptide (see, e.g.,
Wood et al. PCT publication WO 91/00906; Kucherlapati et al. PCT publication WO
91/10741; Lonberg et al. PCT publication WO 92/03918; Kay et al. PCT publication
92/03917; Lonberg, N. et al. (1994) Nature 368:856-859; Green, L.L. et al. (1994)
Nature Genet. 7:13-21; Morrison, S.L. et al. (1994) Proc. Natl. Acad. Sci. USA 81:6851-6855;
Bruggeman et al. (1993) Year Immunol 7:33-40; Tuaillon et al. (1993) Proc. Natl.
Acad. Sci. USA 90:3720-3724; and Bruggeman et al. (1991) Eur J Immunol 21:1323-1326).

Monoclonal antibody compositions of the invention can also be produced by
other methods well known to those skilled in the art of recombinant DNA technology.
An alternative method, referred to as the "combinatorial antibody display" method, has
been developed to identify and isolate antibody fragments having a particular antigen
specificity, and can be utilized to produce monoclonal antibodies that bind a p62
polypeptide of the invention (for descriptions of combinatorial antibody display see e.g.,
Sastry et al. (1989) PNAS 86:5728; Huse et al. (1989) Science 246:1275; and Orlandi et
al. (1989) PNAS 86:3833). After immunizing an animal with a p62 polypeptide, the
antibody repertoire of the resulting B-cell pool is cloned. Methods are generally known
for directly obtaining the DNA sequence of the variable regions of a diverse population
of immunoglobulin molecules by using a mixture of oligomer primers and PCR. For
instance, mixed oligonucleotide primers corresponding to the 5' leader (signal peptide)
sequences and/or framework 1 (FR1) sequences, as well as a primer to a conserved 3'
constant region, can be used for PCR amplification of the heavy and light chain
variable regions from a number of murine antibodies (Larrick et al. (1991)
Biotechniques 11:152-156). A similar strategy can also be used to amplify human
heavy and light chain variable regions from human antibodies (Larrick et al. (1991)
Methods: Companion to Methods in Enzymology 2:106-110).

In an illustrative embodiment, RNA is isolated from activated B cells of, for
example, peripheral blood cells, bone marrow, or spleen preparations, using standard
protocols (e.g., U.S. Patent No. 4,683,202; Orlandi, et al. PNAS (1989) 86:3833-3837;
Sastry et al., PNAS (1989) 86:5728-5732; and Huse et al. (1989) Science 246:1275-1281).
First-strand cDNA is synthesized using primers specific for the constant region
of the heavy chain(s) and each of the κ and λ light chains, as well as primers for the
signal sequence. Using variable region PCR primers, the variable regions of both heavy
and light chains are amplified, each alone or in combination, and ligated into appropriate
vectors for further manipulation in generating the display packages. Oligonucleotide
primers useful in amplification protocols may be unique or degenerate or incorporate
inosine at degenerate positions. Restriction endonuclease recognition sequences may
also be incorporated into the primers to allow for the cloning of the amplified fragment
into a vector in a predetermined reading frame for expression.
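
To make the notion of a degenerate primer concrete, the sketch below expands a primer written with standard IUPAC nucleotide ambiguity codes into the set of unique sequences it represents. It is offered only as an illustration: the primer strings are invented, the function name `expand_degenerate_primer` is hypothetical, and inosine is not modeled because it pairs with several bases rather than standing for a defined mixture.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate_primer(primer: str) -> list[str]:
    """List every unique sequence encoded by a degenerate primer."""
    # A KeyError here signals a character outside the IUPAC table above.
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primers: R = A or G, Y = C or T, N = any base.
print(expand_degenerate_primer("ARY"))
print(len(expand_degenerate_primer("GGNATR")))
```

The number of sequences grows multiplicatively with each degenerate position, which is one practical reason to limit degeneracy when designing variable region primers.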

The V-gene library cloned from the immunization-derived antibody repertoire
can be expressed by a population of display packages, preferably derived from
filamentous phage, to form an antibody display library. Ideally, the display package
comprises a system that allows the sampling of very large diverse antibody display
libraries, rapid sorting after each affinity separation round, and easy isolation of the
antibody gene from purified display packages. In addition to commercially available
kits for generating phage display libraries (e.g., the Pharmacia Recombinant Phage
Antibody System, catalog no. 27-9400-01; and the Stratagene SurfZAP™ phage display
kit, catalog no. 240612), examples of methods and reagents particularly amenable for
use in generating a diverse antibody display library can be found in, for example, Ladner
et al. U.S. Patent No. 5,223,409; Kang et al. International Publication No. WO
92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al.
International Publication WO 92/20791; Markland et al. International Publication No.
WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al.
International Publication No. WO 92/01047; Garrard et al. International Publication No.
WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al.
(1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85;
Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734;
Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature
352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991)
Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137;
and Barbas et al. (1991) PNAS 88:7978-7982.

In certain embodiments, the V region domains of heavy and light chains can be
expressed on the same polypeptide, joined by a flexible linker to form a single-chain Fv
fragment, and the scFV gene subsequently cloned into the desired expression vector or
phage genome. As generally described in McCafferty et al., Nature (1990) 348:552-554,
complete VH and VL domains of an antibody, joined by a flexible (Gly4-Ser)3
linker, can be used to produce a single chain antibody which can render the display
package separable based on antigen affinity. Isolated scFV antibodies immunoreactive
with a peptide having activity of a p62 polypeptide can subsequently be formulated into
a pharmaceutical preparation for use in the subject method.

Once displayed on the surface of a display package (e.g., filamentous phage), the
antibody library is screened with a p62 polypeptide, or peptide fragment thereof, to
identify and isolate packages that express an antibody having specificity for the p62
polypeptide. Nucleic acid encoding the selected antibody can be recovered from the
display package (e.g., from the phage genome) and subcloned into other expression
vectors by standard recombinant DNA techniques.

The polyclonal or monoclonal antibodies of the current invention, such as an
antibody specifically reactive with a recombinant or synthetic peptide having a p62
activity, can also be used to isolate the native p62 polypeptides from cells. For example,
antibodies reactive with the peptide can be used to isolate the naturally-occurring or
native form of p62 from, for example, B cells by immunoaffinity chromatography. In
addition, the native form of cross-reactive p62-like molecules can be isolated from B
cells or other cells by immunoaffinity chromatography with an anti-p62 antibody.

IV. Uses and Methods of the Invention

The invention further pertains to methods for inhibiting cell proliferation in a
subject. These methods include administering to the subject a therapeutically effective
amount of an agent which modulates p62 expression such that p62 expression is
stimulated. Alternative methods for inhibiting cell proliferation in a subject include
administering to the subject a therapeutically effective amount of a p62 polypeptide or
fragment thereof or a vector comprising a nucleic acid molecule encoding a p62
polypeptide or fragment thereof. The term "inhibiting" as used herein refers to
prevention, retardation, and/or termination of cell proliferation. As used herein, the
phrase "cell proliferation" includes cell reproduction by, for example, cell division. Cell
proliferation can be associated with normal cellular reproduction or can be associated
with abnormal cellular reproduction, such as neoplasia. Subjects who can be treated by
the method of this invention include living organisms, e.g., mammals. Examples of
preferred subjects are those who have or are susceptible to unwanted cell proliferation,
e.g., cell proliferation associated with neoplasia, e.g., neoplasia associated with p53
deregulation. Agents which modulate p62 expression, p62 polypeptides, and vectors
containing nucleic acid encoding p62 polypeptides can be administered to the subject by
a route of administration which allows the agent, polypeptide, or vector to perform its
intended function. Various routes of administration are described herein in the section
entitled "Pharmaceutical Compositions". Administration of a therapeutically active or
therapeutically effective amount of an agent, polypeptide, or vector of the present
invention is defined as an amount effective, at dosages and for periods of time necessary
to achieve the desired result. Other methods of the invention include methods for
promoting cell proliferation in a subject. In one embodiment, these methods include
administering to the subject a therapeutically effective amount of an agent which
modulates p62 expression such that p62 expression is inhibited. In other embodiments,
these methods include administering to the subject a therapeutically effective amount of
an inhibitor of a p62 polypeptide such as a nucleic acid molecule which is antisense to a
nucleic acid molecule encoding a p62 polypeptide or an antibody which binds a p62
polypeptide. The term "promoting" as used herein refers to activation or inducement of
cell proliferation. In certain instances, it is desirable to promote cell proliferation. For
example, promotion of cell proliferation would be desirable to promote wound healing or
to promote hair growth.

Still other methods of the present invention include methods for treating cancer, 
e.g., cancer associated with inhibition or deregulation of the tumor suppressor p53, e.g., 
cervical cancer, e.g., HPV-induced cervical cancer, in a subject. These methods include 
administering to the subject a therapeutically effective amount of a p62 polypeptide or 
fragment thereof, a therapeutically effective amount of a vector comprising a nucleic 
acid molecule encoding a p62 polypeptide, or a therapeutically effective amount of an 
agent which modulates p62 expression. 

In one embodiment, the methods of the invention can be used to treat cervical 
cancer, specifically cervical cancer induced by HPV, e.g., HPV-1, HPV-2, HPV-3, 
HPV-4, HPV-5, HPV-6, HPV-7, HPV-8, HPV-9, HPV-10, HPV-11, HPV-12, HPV-14, 
HPV-13, HPV-15, HPV-16, HPV-17 or HPV-18, and particularly high-risk HPVs, such as 
HPV-16, HPV-18, HPV-31 and HPV-33. The papillomaviruses (PV) are infectious 
agents that can cause benign epithelial tumors, or warts, in their natural hosts. Infection 
with specific HPVs has been associated with the development of human epithelial 
malignancies, including that of the uterine cervix, genitalia, skin and less frequently, 
other sites. Two of the transforming proteins produced by papillomaviruses, the E6 
protein and E7 protein, form complexes with the tumor suppressor gene products p53 
and Rb, respectively, indicating that these viral proteins may exert their functions 
through critical pathways that regulate cellular growth control. Such agents can be of 
use therapeutically to prevent E6-AP/E6 complexes in cells infected by, for example, 
human papillomaviruses, e.g., HPV-1, HPV-2, HPV-3, HPV-4, HPV-5, HPV-6, HPV-7, 
HPV-8, HPV-9, HPV-10, HPV-11, HPV-12, HPV-14, HPV-13, HPV-15, HPV-16, 
HPV-17 or HPV-18, particularly high-risk HPVs, such as HPV-16, HPV-18, HPV-31 
and HPV-33. Contacting such cells with agents that alter the formation of one or more 
E6-BP/E6 complexes can inhibit pathological progression of papillomavirus infection, 
such as preventing or reversing the formation of warts, e.g., plantar warts (verruca 
plantaris), common warts (verruca vulgaris), Butcher's common warts, flat warts 
(verruca plana), genital 
warts (condyloma acuminatum), or epidermodysplasia verruciformis; as well as treating 
papillomavirus cells which have become, or are at risk of becoming, transformed and/or 
immortalized, e.g., cancerous, e.g., a laryngeal papilloma, a focal epithelial, a cervical 
carcinoma. 

Further methods of the invention include methods for modulating T cell activity 
in a subject comprising administering to the subject a therapeutically effective amount of 
an agent which modulates p62 expression. Alternative methods for modulating T cell 
activity in a subject include administering to the subject a therapeutically effective 
amount of an agent which activates or inhibits a p62 polypeptide. Similar methods can 
be employed for modulating B cell activity. The term "modulate" as used herein refers 
to inhibition or activation/stimulation of a cell, e.g., a leukocyte. The term "leukocyte" 
is intended to include a cell of the blood which is not a red blood cell and includes 
lymphocytes, granulocytes, and monocytes. A preferred leukocyte is a lymphocyte, 
such as a B cell or a T cell. 

T cell activity can be modulated, e.g., stimulated, in the methods of the present 
invention. T cell activation refers to a T cell response such as T cell proliferation, T 
cytotoxic activity, secretion of cytokines, differentiation or any T cell effector function. 
The term "T cell activation" is used herein to define a state in which a T cell response 
has been initiated or activated by a primary signal, such as through the TCR/CD3 
complex, but not necessarily due to interaction with a protein antigen. A T cell is 
activated if it has received a primary signaling event which initiates an immune response 
by the T cell. 

T cell activation can be accomplished by stimulating the T cell TCR/CD3 
complex or via stimulation of the CD2 surface protein. An anti-CD3 monoclonal 
antibody can be used to activate a population of T cells via the TCR/CD3 complex. 
Although a number of anti-human CD3 monoclonal antibodies are commercially 
available, OKT3 prepared from hybridoma cells obtained from the American Type 
Culture Collection or monoclonal antibody G19-4 is preferred. Similarly, binding of an 
anti-CD2 antibody will activate T cells. Stimulatory forms of anti-CD2 antibodies are 
known and available. Stimulation through CD2 with anti-CD2 antibodies is typically 
accomplished using a combination of at least two different anti-CD2 antibodies. 
Stimulatory combinations of anti-CD2 antibodies which have been described include the 
following: the T11.3 antibody in combination with the T11.1 or T11.2 antibody (Meuer, 
S.C. et al. (1984) Cell 36:897-906) and the 9.6 antibody (which recognizes the same 
epitope as T11.1) in combination with the 9-1 antibody (Yang, S. Y. et al. (1986) J. 
Immunol. 137:1097-1100). Other antibodies which bind to the same epitopes as any of 
the above described antibodies can also be used. Additional antibodies, or combinations 
of antibodies, can be prepared and identified by standard techniques. 

A primary activation signal can also be provided by a polyclonal activator. 
Polyclonal activators include agents that bind to glycoproteins expressed on the plasma 
membrane of T cells and include lectins, such as phytohemagglutinin (PHA), 
concanavalin A (Con A) and pokeweed mitogen (PWM). 

A primary activation signal can also be delivered to a T cell through use of a 
combination of a protein kinase C (PKC) activator such as a phorbol ester (e.g., phorbol 
myristate acetate) and a calcium ionophore (e.g., ionomycin, which raises cytoplasmic 
calcium concentrations). The use of these agents bypasses the TCR/CD3 complex but 
delivers a stimulatory signal to T cells. These agents are also known to exert a 
synergistic effect on T cells to promote T cell activation and can be used in the absence 
of antigen to deliver a primary activation signal to T cells. 

The term "B cell" is intended to include a B lymphocyte that is at any state of 
maturation. Thus, the B cell can be a progenitor cell, a pre-B cell, an immature B cell, a 
mature B cell, a blast cell, a centroblast, a centrocyte, an activated B cell, a memory B 
cell, or an antibody secreting plasma cell. A preferred B cell is an activated B cell, i.e., a 
B cell which has encountered an antigen. The term "B cell response" is intended to 
include a response of a B cell to a stimulus. The stimulus can be a soluble stimulus such 
as an antigen, a lymphokine, or a growth factor or a combination thereof. Alternatively, 
the stimulus can be a membrane bound molecule, such as a receptor on T helper (Th) 
cells, e.g., CD28, CTLA4, gp39, or an adhesion molecule. Since a change in a B cell, 
such as a change occurring during the process of B cell maturation or activation, is 
mediated by extracellular factors and membrane bound molecules, a response of a B cell 
is intended to include any change in a B cell, such as a change in stage of differentiation 
or secretion of factors, e.g., antibodies. Thus, a modulation of a B cell response can be a 
modulation of B cell aggregation, a modulation of B cell differentiation, such as 
differentiation into a plasma cell or into a memory B cell, or a modulation of cell 
viability. In a preferred embodiment, the invention provides a method for stimulating 
the differentiation of a B cell from a lymphoblast to a centrocyte. In another preferred 
embodiment, the invention provides a method for modulating B cell aggregation, such as 
homotypic B cell aggregation. In another embodiment, the invention provides a method 
for modulating B cell survival. In yet another preferred embodiment, the invention 
provides a method for modulating production of antibodies by B cells. In a further 
embodiment, the invention provides a method for modulating proliferation of B cells. 

Other aspects of the invention pertain to methods for identifying agents which 
modulate, e.g., inhibit or activate/stimulate, a p62 polypeptide or expression thereof. 
Also contemplated by the invention are the agents which modulate, e.g., inhibit or 
activate/stimulate, p62 polypeptides or p62 polypeptide expression and which are 
identified according to methods of the present invention. In one embodiment, these 
methods include contacting a first polypeptide comprising an SH2 domain of p56lck 
with a second polypeptide comprising a p62 polypeptide and an agent to be tested and 
determining binding of the second polypeptide to the first polypeptide. Inhibition of 
binding of the first polypeptide to the second polypeptide indicates that the agent is an 
inhibitor of a p62 polypeptide. Activation of binding of the first polypeptide to the 
second polypeptide indicates that the agent is an activator/stimulator of a p62 
polypeptide. Methods for testing the binding of an agent to the SH2 domain of p56lck 
are described herein. 

In another embodiment, these methods include contacting a p53 protein, p53 
analog, derivative or active fragment, under conditions which promote ubiquitination of 
the p53 protein, p53 analog, derivative or active fragment, with an agent to be tested and 
determining p53 ubiquitination level in the presence of the agent. An activation of p53 
ubiquitination indicates that the agent is an inhibitor of a p62 polypeptide. An inhibition 
of p53 ubiquitination indicates that the agent is an activator of a p62 polypeptide. To 
measure p53 ubiquitination, a skilled artisan can follow the protocol set forth in 
Scheffner et al. (1993) Cell 75:495. In particular, p53 ubiquitination can be measured by 
using in vitro translated human wild type p53 as a p53 source. Human E6AP, papilloma 
E6 and HeLa p62 can then be expressed as GST fusion proteins in E. coli. Other 
components used in the system to measure p53 ubiquitination include E1 and UBC8, 
which can be expressed in E. coli using a pET expression system as previously described 
(Hatfield and Vierstra (1992) J. Biol. Chem. 267:14799). A 50 ml total reaction mixture 
typically contains 4 ml of p53, 100-200 ng of E6, p62, E6AP, E1 and UBC8 in a reaction 
buffer. The reaction buffer typically includes 25 mM Tris, pH 7.5, 50 mM NaCl, 5 mM 
MgCl2, 0.1 mM DTT, 5 mM ubiquitin, and 5 mM ATPγS. The reaction mixture is 
generally incubated at 30°C for two hours and stopped with the addition of SDS-buffer. 
The reaction products are separated on a 10% SDS-PAGE gel and visualized by 
fluorography to determine ubiquitination of p53. 

In yet another embodiment, these methods include contacting a first polypeptide 
comprising ubiquitin, a ubiquitin analog, derivative or active fragment, with a second 
polypeptide comprising a p62 polypeptide and an agent to be tested and determining 
binding of the second polypeptide to the first polypeptide. Inhibition of binding of the 
first polypeptide to the second polypeptide indicates that the agent is an inhibitor of a 
p62 polypeptide. Activation of binding of the first polypeptide to the second 
polypeptide indicates that the agent is an activator of a p62 polypeptide. Methods for 
testing the binding of an agent to ubiquitin are described herein. 

In yet another embodiment, these methods include contacting a first polypeptide 
comprising a p53 protein, p53 analog, derivative or active fragment, with a second 
polypeptide comprising a p62 polypeptide and an agent to be tested, measuring the level 
of p53 degradation in the presence of the agent, and comparing the level of p53 
degradation in the presence of the agent to the level of p53 degradation in the absence of the 
agent. An increase in the level of p53 degradation in the presence of the agent indicates 
that the agent is an inhibitor of a p62 polypeptide. A decrease in the level of p53 
degradation in the presence of the agent indicates that the agent is an activator of a p62 
polypeptide. p53 degradation can be measured using the method described in Scheffner 
et al. (1990) Cell 63:1129-1136. For example, p53 degradation can be measured by 
using two milliliters of in vitro translated human wild type p53 and ten milliliters of 
papilloma virus E6-GST fusion protein incubated together at 25°C for three hours in 
25 mM Tris, pH 7.5, 50 mM NaCl and 2 mM DTT. Reaction mixtures also contain a total 
of about ten milliliters of rabbit reticulocyte lysate per forty milliliters of reaction mixture. 
The reactions are stopped with the addition of SDS-buffer and samples are separated on 
10% SDS-PAGE gels and visualized by fluorography to determine p53 degradation. 
p53 degradation can also be measured using a reaction mixture which includes E6 and 
E6AP-supplemented wheat-germ lysate or a reaction mixture containing purified E1, 
appropriate E2, E6, and E6AP. Scheffner et al. (1993) Cell 75:495-505. 
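
The decision rules stated for the four screening assays above can be summarized in a short sketch. It is illustrative only: the assay labels, the function name `classify_agent`, and the 10% tolerance band used to call "no effect" are assumptions introduced here and do not appear in the specification, which states only the direction of change that indicates an inhibitor or an activator.

```python
from enum import Enum


class Call(Enum):
    INHIBITOR = "inhibitor of a p62 polypeptide"
    ACTIVATOR = "activator of a p62 polypeptide"
    NO_EFFECT = "no detectable effect"


# Sign of the readout change that indicates an INHIBITOR of p62, as stated
# in the text for each assay. The dictionary keys are hypothetical labels.
INHIBITOR_DIRECTION = {
    "sh2_binding": -1,         # reduced binding of p62 to the p56lck SH2 domain
    "ubiquitin_binding": -1,   # reduced binding of p62 to ubiquitin
    "p53_ubiquitination": +1,  # increased p53 ubiquitination
    "p53_degradation": +1,     # increased p53 degradation
}


def classify_agent(assay, with_agent, without_agent, tolerance=0.10):
    """Classify a test agent from one readout measured with and without it.

    `tolerance` is an assumed relative change below which no call is made.
    """
    if without_agent <= 0:
        raise ValueError("control readout must be positive")
    change = (with_agent - without_agent) / without_agent
    if abs(change) <= tolerance:
        return Call.NO_EFFECT
    direction = 1 if change > 0 else -1
    if direction == INHIBITOR_DIRECTION[assay]:
        return Call.INHIBITOR
    return Call.ACTIVATOR
```

For instance, under these assumptions a 50% rise in p53 degradation relative to the no-agent control would be called an inhibitor, whereas a 50% rise in ubiquitin binding would be called an activator.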

V. p160 Nucleic Acids, Polypeptides, and Methods of Use 

As described herein, the present invention is also based on the discovery of a 
second family of polypeptides, designated herein as p160 polypeptides. The p160 
polypeptides act downstream from the p62 polypeptides. Specifically, p160 
polypeptides of the invention are capable of binding to the p62/p56lck complex to 
thereby modulate Lck function in a similar manner as described herein for the p62 
polypeptides. The p160 polypeptides activate transcription. p160 polypeptides include 
leucine zipper domains which are found in some transcription factors, e.g., jun, fos, myc, 
CEBP, etc. The leucine zipper domain in the p160.1 polypeptide comprises amino acids 
3 to 138 of the amino acid sequence of Figure 9, SEQ ID NO:7 (encoded by nucleotides 
447-888 of the nucleotide sequence of Figure 8, SEQ ID NO:6) and the leucine zipper 
domain of the p160.2 polypeptide comprises amino acids 3 to 138 of the amino acid 
sequence of Figure 11, SEQ ID NO:9 (encoded by nucleotides 447-888 of the nucleotide 
sequence of Figure 10, SEQ ID NO:8). The p160 polypeptides also include 
proline/lysine rich and glutamic acid rich regions. For example, the p160.1 polypeptide 
includes a proline/lysine rich region at amino acid residues 740 to 868 of the amino acid 
sequence of Figure 9, SEQ ID NO:7 (encoded by nucleotides 2656 to 3042 of the 
nucleotide sequence of Figure 8, SEQ ID NO:6). The p160.2 polypeptide includes a 
proline/lysine rich region at amino acid residues 510 to 638 of the amino acid sequence 
of Figure 11, SEQ ID NO:9 (encoded by nucleotides 1966 to 2352 of the nucleotide 
sequence of Figure 10, SEQ ID NO:8). The glutamic acid rich regions of the p160.1 and 
p160.2 polypeptides appear at amino acid residues 884 to 1100 of the amino acid 
sequence of Figure 9, SEQ ID NO:7 (encoded by nucleotides 3088 to 3732 of the 
nucleotide sequence of Figure 8, SEQ ID NO:6) and 654 to 870 of the amino acid 
sequence of Figure 11, SEQ ID NO:9 (encoded by nucleotides 2398 to 3032 of the 
nucleotide sequence of Figure 10, SEQ ID NO:8). 
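
The relationship between the residue ranges and the nucleotide ranges quoted above follows from three nucleotides per codon. The sketch below illustrates that arithmetic. It assumes that the coding sequence begins at nucleotide 439 in both SEQ ID NO:6 and SEQ ID NO:8; that offset is inferred here from the proline/lysine rich ranges and is not stated in the text, and the name `residues_to_nucleotides` is hypothetical.

```python
# Assumption: first nucleotide of codon 1, inferred from the quoted
# proline/lysine rich ranges rather than stated in the specification.
ORF_START = 439


def residues_to_nucleotides(first_residue, last_residue, orf_start=ORF_START):
    """Return the (start, end) nucleotide positions, 1-based and inclusive,
    that encode residues first_residue..last_residue."""
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("invalid residue range")
    start = orf_start + 3 * (first_residue - 1)
    end = orf_start + 3 * last_residue - 1
    return start, end


# Proline/lysine rich region of p160.1 (residues 740 to 868)
print(residues_to_nucleotides(740, 868))   # (2656, 3042)
# Proline/lysine rich region of p160.2 (residues 510 to 638)
print(residues_to_nucleotides(510, 638))   # (1966, 2352)
```

Under this assumed offset the proline/lysine rich ranges are reproduced exactly; ranges that do not fit the same arithmetic should be checked against the figures rather than against this sketch.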

The p160 polypeptides also contain regions which are homologous to regions 
found in other transcription factors such as oct-2. Specifically, the p160 polypeptides 
activate transcription of a variety of genes upon, for example, activation of p62. The 
genes which are transcribed in response to p160 activation likely include those which are 
involved in T or B cell development/differentiation, T or B cell activation, and 
production of T or B cell-specific factors, e.g., lymphokines and antibodies, respectively. 
The p160 polypeptides of the present invention have also been found to be substrates for 
serine/threonine kinase activity. A plasmid containing the full length nucleotide 
sequence (as shown in Figure 8, SEQ ID NO:6) encoding the first p160 polypeptide 
(also designated herein as p160.1) was deposited with the American Type Culture 
Collection (ATCC) on December 19, 1995 and was assigned ATCC Accession Number 
97385. A second plasmid containing the full length nucleotide sequence (as shown in 
Figure 10, SEQ ID NO:8) encoding the second p160 polypeptide (also designated herein 
as p160.2) was deposited with the American Type Culture Collection (ATCC) and was 
assigned ATCC Accession Number 97384. A comparison of the nucleotide sequences 
of the first p160 polypeptide and the second p160 polypeptide is shown in Figure 18. A 
comparison of the amino acid sequences of the first p160 polypeptide and the second 
p160 polypeptide is shown in Figure 19. 

Accordingly, the present invention pertains to isolated nucleic acid molecules 
comprising a nucleotide sequence, or a portion or fragment thereof, shown in Figure 8, 
SEQ ID NO:6 or Figure 10, SEQ ID NO:8 or having at least about 60%, more preferably 
at least about 70%, yet more preferably at least about 80%, and most preferably 90% or 
more overall sequence identity with the nucleotide sequence shown in Figure 8, SEQ ID 
NO:6 or Figure 10, SEQ ID NO:8 or a portion or fragment thereof. These nucleotide 
sequences represent two isoforms of the p160 nucleic acid. The second p160 
polypeptide, p160.2, is missing two exons which are included in the first p160 
polypeptide, p160.1. These exons are located at amino acid residues 210-354 of Figure 
9, SEQ ID NO:7, which are encoded by nucleotides 1066-1500 of Figure 8, SEQ ID 
NO:6 and at amino acid residues 508-592 of Figure 9, SEQ ID NO:7, which are encoded 
by nucleotides 1959-2213 of Figure 8, SEQ ID NO:6. In other embodiments, the 
isolated nucleic acid molecules comprise nucleotide sequences which encode an amino 
acid sequence, or portion or fragment thereof, shown in Figure 9, SEQ ID NO:7 or 
Figure 11, SEQ ID NO:9 or having at least about 60%, more preferably at least about 70%, 
yet more preferably at least about 80%, and most preferably 90% or more overall 
sequence identity with the amino acid sequence, or portion or fragment thereof, shown 
in Figure 9, SEQ ID NO:7 or Figure 11, SEQ ID NO:9. The p160 nucleic acid 
molecules of the present invention can be contained within vectors as described herein. 
Such vectors can be introduced into host cells as described herein. 
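
As an illustration of the sequence identity thresholds recited above, the following sketch computes percent identity over two sequences that have already been aligned and reports the highest recited threshold that is met. The specification does not say how sequences are to be aligned or how gaps are to be counted; treating a gap opposite a residue as a mismatch, and the function names used here, are assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the aligned columns of two equal-length strings.

    Columns in which both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch (an assumption, not taken from the text).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_tier(pct):
    """Return the highest recited threshold (90, 80, 70 or 60) that pct meets,
    or None if pct is below 60."""
    for threshold in (90, 80, 70, 60):
        if pct >= threshold:
            return threshold
    return None
```

A pair of aligned ten-residue sequences differing at one position would give 90% identity and fall in the most preferred tier under this reading.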

The present invention also pertains to isolated polypeptides having a p160 
activity. p160 activities parallel the activities set forth herein for p62. Thus, 
polypeptides having p160 activity can have one or more of the activities set forth herein 
for p62 polypeptides. Preferred polypeptides include those which comprise an amino 
acid sequence shown in Figure 9, SEQ ID NO:7 or Figure 11, SEQ ID NO:9 or a 
fragment or portion thereof. The p160 polypeptides of the present invention can be 
included in fusion proteins, used to generate antibodies, and used in methods for 
modulating cell proliferation, methods for modulating leukocyte activity, and methods 
for identifying modulators of p160 polypeptides as described herein for p62 
polypeptides. 

VI. Applications of the Invention 

The invention provides a method for modulating B cell activity in a subject. In 
one embodiment, the invention provides a method for stimulating a B cell response. 
Stimulation of a B cell response can result in increased B cell aggregation, increased B 
cell differentiation and/or increased B cell survival. The B cells can, for example, be 
stimulated to differentiate from a lymphoblast to a centroblast or centrocyte and thereby 
stimulate the differentiation of B cells into either antibody secreting plasma cells or 
memory B cells. In another embodiment, the invention provides a method for 
stimulating a T cell response, such as T cell proliferation. In a preferred embodiment, 
the invention provides a method for stimulating a B cell response and a T cell response, 
such as T cell proliferation. It will be appreciated that it is particularly advantageous to 
stimulate both B cells and T cells for most applications. 

A p62 polypeptide or an agent which stimulates a p62 polypeptide or expression 
thereof can also be used for treating disorders in which boosting of a B cell response is 
beneficial. Such disorders include infections by pathogenic microorganisms, such as 
bacteria, viruses, and protozoans. Preferred disorders for treating according to the 
method of the invention include extracellular bacterial infections, wherein bacteria are 
eliminated through opsonization and phagocytosis or through activation of the 
complement. Other preferred infections that can be treated according to the method of 
the invention include viral infections, including infections with an Epstein-Barr virus or 
retroviruses, e.g., a human immunodeficiency virus. 

In another embodiment of the invention, p62 polypeptides and/or agents which 
stimulate p62 polypeptides can be administered to a subject having an antibody 
deficiency disorder resulting, for example, in recurrent infections and 
hypogammaglobulinemia (Ochs et al. (1989) Disorders in Infants and Children, Stiehm 
(ed.) Philadelphia, W.B. Saunders, pp 226-256). These disorders include common 
variable immunodeficiency (CVI), hyper-IgM syndrome (HIM), and X-linked 
agammaglobulinemia (XLA). Some of these disorders, e.g., HIM, are caused by a 
mutation in the CD40 ligand, gp39, on the T cell and administration of a p62 
polypeptide or an agent which stimulates a p62 polypeptide or expression thereof would 
thus compensate for at least some of the B cell deficiencies, such as stimulation of B cell 
differentiation. 

Furthermore, upregulation of a B cell response is also useful for treating a 
subject with a tumor. In one embodiment, a p62 polypeptide or an agent which 
stimulates a p62 polypeptide is administered at the site of the tumor. In another 
embodiment, a p62 polypeptide and/or an agent which stimulates a p62 polypeptide is 
administered systemically. 

In another embodiment, the invention provides a method for stimulating B cells 
in culture, such as hybridoma cells. In a preferred embodiment, stimulation of the 
population of B cells results in increased antibody production. Thus, a p62 polypeptide 
or an agent which stimulates a p62 polypeptide can be added at an effective dose to a B 
cell culture, such as a hybridoma, such that antibody production by the B cells is 
enhanced. The effective dose of the p62 polypeptide or the agent which stimulates a p62 
polypeptide to be added to the culture can easily be determined experimentally. This 
can be done, for example, by adding various amounts of the polypeptide or agent to a 
constant amount of B cells, and by monitoring the amount of antibody produced, e.g., by 
ELISA. The effective dose corresponds to the dose at which the highest amounts of 
antibodies are produced. 
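
The dose-finding procedure just described amounts to choosing the dose with the highest measured antibody output. A minimal sketch follows; the data layout (a mapping from dose to replicate ELISA readings), the averaging of replicates, and the example numbers are assumptions made for illustration and are not taken from the specification.

```python
def effective_dose(titration):
    """Return the dose whose mean ELISA reading is highest.

    `titration` maps each tested dose to a list of replicate readings taken
    from cultures seeded with a constant number of B cells.
    """
    if not titration:
        raise ValueError("no doses tested")
    means = {dose: sum(readings) / len(readings)
             for dose, readings in titration.items()}
    return max(means, key=means.get)


# Hypothetical titration: dose (arbitrary units) -> replicate absorbances
example = {
    0.0: [0.2, 0.3],
    1.0: [0.8, 0.9],
    10.0: [1.5, 1.4],
    100.0: [1.1, 1.0],
}
print(effective_dose(example))  # 10.0
```

In this made-up titration the response falls off at the highest dose, so the intermediate dose would be selected.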

In yet another embodiment, a p62 polypeptide or an agent which stimulates a p62 
polypeptide is administered together with a hybridoma into the peritoneal cavity of a 
mouse, such that the amount of antibody produced by the hybridoma is increased. 

In another embodiment of the invention, a T cell is contacted with a p62 
polypeptide or an agent which stimulates a p62 polypeptide and a primary activation 
signal, such that T cell proliferation is increased. The primary activation signal can be 
an antigen, or a combination of antigens, such that proliferation of one or more clonal 
populations of T cells is stimulated. Alternatively, the primary activation signal can be a 
polyclonal agent, such as an antibody to CD3, such that T cell proliferation is stimulated 
in a non-clonal manner. 

In one embodiment, the invention provides a method for expanding a population 
of T cells ex vivo. Accordingly, primary T cells obtained from a subject are incubated 
with a p62 polypeptide or an agent which stimulates a p62 polypeptide and a primary 
activation signal. Following activation and stimulation of the T cells, the progress of 
proliferation of the T cells in response to continuing exposure to the p62 polypeptide or 
the agent which stimulates a p62 polypeptide is monitored. When the rate of T cell 
proliferation decreases, the T cells are reactivated and restimulated, such as with 
additional anti-CD3 antibody and a p62 polypeptide or an agent which stimulates a p62 
polypeptide in the T cell, to induce further proliferation. The monitoring and 
restimulation of the T cells can be repeated for sustained proliferation to produce a 
population of T cells increased in number from about 100- to about 100,000-fold over 
the original T cell population. Methods for stimulating the expansion of a population of 
T cells are further described in the published PCT application PCT/US94/06255. 
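
The monitor-and-restimulate cycle described above can be sketched as two small helpers. This is an illustration under stated assumptions: cell counts are taken at equally spaced time points, a "decrease in the rate of proliferation" is read as the latest growth ratio being lower than the previous one, and the function names are hypothetical. The specification does not define how the rate is measured.

```python
def fold_expansion(initial_count, current_count):
    """Fold increase of the population over the original T cell population."""
    if initial_count <= 0:
        raise ValueError("initial count must be positive")
    return current_count / initial_count


def needs_restimulation(counts):
    """Return True when the most recent growth ratio has dropped.

    `counts` are cell counts at equally spaced time points (an assumption).
    With fewer than three counts no rate comparison is possible.
    """
    if len(counts) < 3:
        return False
    previous_rate = counts[-2] / counts[-3]
    latest_rate = counts[-1] / counts[-2]
    return latest_rate < previous_rate
```

Under these assumptions, counts of 1e6, 4e6 and 6e6 cells would trigger restimulation because the growth ratio falls from 4.0 to 1.5, and a culture grown from 1e6 to 1e8 cells has reached the lower end of the 100- to 100,000-fold range.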

The method of the invention can be used to expand selected T cell populations 
for use in treating an infectious disease or cancer. The resulting T cell population can be 
genetically transduced and used for immunotherapy or can be used for in vitro analysis 
of infectious agents such as HIV. Proliferation of a population of CD4+ cells obtained 
from an individual infected with HIV can be achieved and the cells rendered resistant to 
HIV infection. Following expansion of the T cell population to sufficient numbers, the 
expanded T cells are restored to the individual. The expanded population of T cells can 
further be genetically transduced before restoration to a subject. Similarly, a population 
of tumor-infiltrating lymphocytes can be obtained from an individual afflicted with 
cancer and the T cells stimulated to proliferate to sufficient numbers and restored to the 
individual. In addition, supernatants from cultures of T cells expanded in accordance 
with the method of the invention are a rich source of cytokines and can be used to 
sustain T cells in vivo or ex vivo. 

In another embodiment of the invention, T cell proliferation is stimulated in vivo. 
In a preferred embodiment, a p62 polypeptide or an agent which stimulates a p62 
polypeptide in the T cell is administered to a subject, such that T cell proliferation in the 
subject is stimulated. The subject can be a subject that is immunodepressed, a subject 
having a tumor, or a subject infected with a pathogen. The agent of the invention can be 
administered locally or systemically. The agent can be administered in a soluble form or 
a membrane bound form. Additional applications for an agent capable of providing a 
costimulatory signal to T cells, such that their proliferation is stimulated, are described 
in the published PCT applications PCT/US94/13782 and PCT/US94/08423, the contents 
of which are incorporated herein by reference. 

Inhibitors of p62 can also be used to reduce B cell and/or T cell responses in 
autoimmune diseases which involve autoreactive B and/or T cells. Accordingly, 
administration of an inhibitor of p62 to a subject can be used for treating a variety of 
autoimmune diseases and disorders having an autoimmune component, including 
diabetes mellitus, arthritis (including rheumatoid arthritis, juvenile rheumatoid arthritis, 
osteoarthritis, psoriatic arthritis), multiple sclerosis, myasthenia gravis, systemic lupus 
erythematosus, autoimmune thyroiditis, dermatitis (including atopic dermatitis and 
eczematous dermatitis), psoriasis, Sjogren's Syndrome, including keratoconjunctivitis 
sicca secondary to Sjogren's Syndrome, alopecia areata, allergic responses due to 
arthropod bite reactions, Crohn's disease, aphthous ulcer, iritis, conjunctivitis, 
keratoconjunctivitis, ulcerative colitis, asthma, allergic asthma, cutaneous lupus 
erythematosus, scleroderma, vaginitis, proctitis, drug eruptions, leprosy reversal 
reactions, erythema nodosum leprosum, autoimmune uveitis, allergic encephalomyelitis, 
acute necrotizing hemorrhagic encephalopathy, idiopathic bilateral progressive 
sensorineural hearing loss, aplastic anemia, pure red cell anemia, idiopathic 
thrombocytopenia, polychondritis, Wegener's granulomatosis, chronic active hepatitis, 
Stevens-Johnson syndrome, idiopathic sprue, lichen planus, Crohn's disease, Graves 
ophthalmopathy, sarcoidosis, primary biliary cirrhosis, uveitis posterior, and interstitial 
lung fibrosis. 

The efficacy of a p62 inhibitor in preventing or alleviating autoimmune disorders 
can be determined using a number of well-characterized animal models of human 
autoimmune diseases. Examples include murine experimental autoimmune encephalitis, 
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine 
autoimmune collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine 
experimental myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, 
New York, 1989, pp. 840-856). 

VII. Pharmaceutical Compositions 

The p62 polypeptides, portions or fragments thereof, and other agents described 
herein can be incorporated into pharmaceutical compositions suitable for administration. 
Such compositions typically comprise the polypeptide, a portion or fragment thereof, or 
agent and a pharmaceutically acceptable carrier. As used herein the term 
"pharmaceutically acceptable carrier" is intended to include any and all solvents, 
dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption 
delaying agents, and the like, compatible with pharmaceutical administration. The use 
of such media and agents for pharmaceutically active substances is well known in the 
art. Except insofar as any conventional media or agent is incompatible with the active 
compound, use thereof in the compositions is contemplated. Supplementary active 
compounds can also be incorporated into the compositions. 

In one embodiment, the agents of the invention can be administered to a subject 
to modulate a B cell response in the subject, e.g., for stimulating the clearance of a 
pathogen from the subject. The agents are administered to the subjects in a biologically 
compatible form suitable for pharmaceutical administration in vivo. By "biologically 
compatible form suitable for administration in vivo" is meant a form of the agents, e.g., 
protein to be administered in which any toxic effects are outweighed by the therapeutic 
effects of the agent. A therapeutically active or therapeutically 
effective amount of an agent of the present invention is defined as an amount effective, 
at dosages and for periods of time necessary, to achieve the desired result. For example, 
a therapeutically active amount of a p62 molecule can vary according to factors such as 
the disease state, age, sex, and weight of the subject, and the ability of the agent to elicit a 
desired response in the subject. Dosage regimens may be adjusted to provide the 
optimum therapeutic response. For example, several divided doses may be administered 
daily or the dose may be proportionally reduced as indicated by the exigencies of the 
therapeutic situation. 

The agent may be administered in a convenient manner such as by injection
(subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal
application, or rectal administration. Depending on the route of administration, the
agent may be coated in a material to protect it from the action of enzymes, acids and
other natural conditions which may inactivate the agent. For example, solutions or
suspensions used for parenteral, intradermal, or subcutaneous application can include the
following components: a sterile diluent such as water for injection, saline solution, fixed
oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents;
antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as
ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic
acid; buffers such as acetates, citrates or phosphates; and agents for the adjustment of
tonicity such as sodium chloride or dextrose. The pH can be adjusted with acids or bases,
such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be
enclosed in ampules, disposable syringes or multiple dose vials made of glass or plastic.

To administer an agent by other than parenteral administration, it may be
necessary to coat the agent with, or co-administer the agent with, a material to prevent
its inactivation. For example, a p62 molecule may be administered to a subject in an
appropriate carrier or diluent co-administered with enzyme inhibitors or in an
appropriate carrier such as liposomes. Pharmaceutically acceptable diluents include
saline and aqueous buffer solutions. Enzyme inhibitors include pancreatic trypsin
inhibitor, diisopropylfluorophosphate (DEP) and trasylol. Liposomes include
water-in-oil-in-water emulsions as well as conventional liposomes (Strejan et al., (1984) J.
Neuroimmunol. 7:27). Dispersions can also be prepared in glycerol, liquid polyethylene
glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use,
these preparations may contain a preservative to prevent the growth of microorganisms.

Pharmaceutical compositions suitable for injectable use include sterile aqueous
solutions (where water soluble) or dispersions and sterile powders for the
extemporaneous preparation of sterile injectable solutions or dispersions. In all cases, the
composition must be sterile and must be fluid to the extent that easy syringability exists.
It must be stable under the conditions of manufacture and storage and must be preserved
against the contaminating action of microorganisms such as bacteria and fungi. The
carrier can be a solvent or dispersion medium containing, for example, water, ethanol,
polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the
like), and suitable mixtures thereof. The proper fluidity can be maintained, for example,
by the use of a coating such as lecithin, by the maintenance of the required particle size
in the case of dispersion and by the use of surfactants. Prevention of the action of
microorganisms can be achieved by various antibacterial and antifungal agents, for
example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In
many cases, it will be preferable to include isotonic agents, for example, sugars,
polyalcohols such as mannitol and sorbitol, or sodium chloride in the composition. Prolonged
absorption of the injectable compositions can be brought about by including in the
composition an agent which delays absorption, for example, aluminum monostearate
and gelatin.

Sterile injectable solutions can be prepared by incorporating the agent in the
required amount in an appropriate solvent with one or a combination of ingredients
enumerated above, as required, followed by filtered sterilization. Generally, dispersions
are prepared by incorporating the agent into a sterile vehicle which contains a basic
dispersion medium and the required other ingredients from those enumerated above. In
the case of sterile powders for the preparation of sterile injectable solutions, the
preferred methods of preparation are vacuum drying and freeze-drying, which yield a
powder of the active ingredient (e.g., peptide) plus any additional desired ingredient
from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They
can be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral
therapeutic administration, the active compound can be incorporated with excipients and
used in the form of tablets, troches, or capsules. Oral compositions can also be prepared
using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is
applied orally and swished and expectorated or swallowed. Pharmaceutically
compatible binding agents and/or adjuvant materials can be included as part of the
composition. The tablets, pills, capsules, troches and the like can contain any of the
following ingredients, or compounds of a similar nature: a binder such as
microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or
lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant
such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a
sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint,
methyl salicylate, or orange flavoring.

In one embodiment, the active compounds are prepared with carriers that will
protect the compound against rapid elimination from the body, such as a controlled
release formulation, including implants and microencapsulated delivery systems.
Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate,
polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid.
Methods for preparation of such formulations will be apparent to those skilled in the art.
The materials can also be obtained commercially from Alza Corporation and Nova
Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected
cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically
acceptable carriers. These may be prepared according to methods known to those skilled
in the art, for example, as described in U.S. Patent No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in
dosage unit form for ease of administration and uniformity of dosage. Dosage unit form
as used herein refers to physically discrete units suited as unitary dosages for the subject
to be treated, each unit containing a predetermined quantity of active compound
calculated to produce the desired therapeutic effect in association with the required
pharmaceutical carrier. The specification for the dosage unit forms of the invention is
dictated by and directly dependent on (a) the unique characteristics of the active
compound and the particular therapeutic effect to be achieved, and (b) the limitations
inherent in the art of compounding such an active compound for the treatment of
individuals.

The present invention is further illustrated by the following examples, which in
no way should be construed as being further limiting. The contents of all
references (including literature references, issued patents, published patent applications,
and co-pending patent applications) cited throughout this application are hereby
expressly incorporated by reference.

EXAMPLES 

Example I: Cloning of cDNA Encoding p62 Polypeptides

p62 was purified from the cell lysate of a 300 liter culture of HeLa cells using
GST.lckSH2 conjugated glutathione agarose beads as an affinity matrix, followed by
separation by SDS-PAGE. Two major proteins (62 kD and 160 kD; p62 and p160,
respectively) on the SDS-PAGE gel were transferred to PVDF membrane. Internal peptides
of purified p62 were obtained by Lys-C digestion followed by reverse-phase HPLC.
Five well-resolved peptide peaks were subjected to automated Edman degradation to
determine amino acid sequence. These five peptides had the following amino acid
sequences:

pk5, WLRK or IYIKE (SEQ ID NOs:10 and 11, respectively)
pk7, LTPVSPESSSTEEK (SEQ ID NO:12)
pk50, NVGESVAAALSPLGI(Q)VDIDVEHGGK (SEQ ID NO:13)
pk55, VAALFPALRPGGFQAHYRDEDGDLVAFSSDEELTMAMSYVK (SEQ ID NO:14)

A HeLa Uni-Zap cDNA library (Stratagene, La Jolla, CA) was then screened
using a degenerate oligonucleotide synthesized based on the internal peptide sequence of
pk55. One of twenty-seven positive clones isolated from the library was a full length
cDNA (2,083 bp) containing a 1,320 bp open reading frame. Northern blot analysis was
performed following standard protocols using a 32P-dCTP labelled probe derived from
the p62 sequence. The mRNA sources used in the Northern analysis were (i) a tissue blot
membrane purchased from Clontech, Palo Alto, CA; and (ii) total or poly A mRNA
purified from cultured HeLa cells, T cells (Jurkat, HPB-ALL and CEM) and B cells
(Daudi and Raji). The Northern analysis showed that p62 is expressed in all
tissues examined, including heart, brain, placenta, lung, liver, skeletal muscle, kidney, and
pancreas, and that the size of the mRNA is around 2.0 kb, confirming that the cDNA isolated
is full length. The deduced amino acid sequence from the cloned p62 cDNA contains
440 amino acids including all five peptide sequences derived from protein sequencing.
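
The relationship between the reported open reading frame length and the deduced protein length can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the described method; the function name `codons_in_orf` is hypothetical, and because the text does not state whether the 1,320 bp figure includes the stop codon, the sketch only counts whole codons.

```python
def codons_in_orf(orf_length_bp: int) -> int:
    """Return the number of whole codons in an ORF of the given length."""
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_length_bp // 3


# Lengths taken from the text above.
CDNA_LENGTH_BP = 2083
ORF_LENGTH_BP = 1320

codons = codons_in_orf(ORF_LENGTH_BP)             # 1320 / 3 = 440
untranslated_bp = CDNA_LENGTH_BP - ORF_LENGTH_BP  # bases outside the ORF

print(f"{codons} codons; {untranslated_bp} bp outside the ORF")
```

The codon count agrees with the 440 amino acids stated for the deduced sequence.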

In parallel, a Daudi B cell cDNA library was screened using the same
oligonucleotide probe. A 1,977 bp long partial cDNA was obtained and sequenced.
This cDNA has 88.5% identity in amino acid sequence and 77.5% identity in nucleotide
sequence to the cDNA isolated from the HeLa cell library. A comparison of the two p62
nucleotide sequences is shown in Figure 6. A comparison of the two p62 amino acid
sequences is shown in Figure 7.
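
Percent identity figures of the kind quoted above are computed from a pairwise alignment. The sketch below shows one common convention (matches divided by ungapped aligned columns); the text does not say which convention or alignment program was used, so both the function `percent_identity` and the toy alignment are illustrative assumptions rather than the actual p62 data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped aligned columns in which the residues match.

    Both inputs must already be aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are excluded under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Made-up alignment for illustration only (not the p62 sequences).
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTACCTA")
print(f"{toy_identity:.1f}% identity over ungapped columns")
```

Other conventions divide by the full alignment length or by the shorter sequence, which gives different percentages for the same alignment.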

Example II: Cloning of cDNA Encoding p160 Polypeptides

p160 was purified from HeLa cell lysates using Lck SH2 affinity
chromatography. The purified protein was subjected to Lys-C digestion and the
resulting peptides were purified on HPLC. Amino acid sequences of seven well
separated peptides were determined and are set forth below:

pk5, GSPDGSLQTGKPSAPK(S) (SEQ ID NO:15)
pk9, LRSPRGSPDGSLQTGK (SEQ ID NO:16)
pk14, LDVGEAMAP(Q) (SEQ ID NO:17)
pk36, EQDDTAAVLADFID (SEQ ID NO:18)
pk39, VQPEPEPEPGLLLEVEEPGTEEERGADD (SEQ ID NO:19)
pk43, VQPPPETPAEEEMETETEAEALQEKE(G)QDD(A)A(A)ML (SEQ ID NO:20)
pk47, VQPEPEPEPGLLLEVEEPGT (SEQ ID NO:21)
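
Several of the peptides listed above share sequence with one another, which can be checked mechanically. The following Python sketch is illustrative only: the helper `longest_overlap` is a hypothetical name, the sequences are copied as printed above, and the parenthesized (uncertain) residue at the end of pk5 is omitted.

```python
def longest_overlap(left: str, right: str) -> str:
    """Return the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return right[:size]
    return ""


# Peptide sequences as listed above (uncertain trailing residue of pk5 omitted).
pk5 = "GSPDGSLQTGKPSAPK"
pk9 = "LRSPRGSPDGSLQTGK"
pk39 = "VQPEPEPEPGLLLEVEEPGTEEERGADD"
pk47 = "VQPEPEPEPGLLLEVEEPGT"

shared = longest_overlap(pk9, pk5)
print(shared, len(shared))        # GSPDGSLQTGK 11
print(pk39.startswith(pk47))      # True: pk47 is a prefix of pk39
```

Under these assumptions, the C-terminal 11 residues of pk9 coincide with the N-terminal 11 residues of pk5, and pk47 matches the first 20 residues of pk39.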

A HeLa cell cDNA library (Stratagene, La Jolla, CA) was screened with 32P-labeled
degenerate oligonucleotide probes synthesized based on the pk36 peptide sequence
shown above. Positives were plaque purified and sequenced. All of the positives had
the same sequence at the C-terminus but differed in length at the N-terminus. The
length of the longest clone obtained was 1.3 kb. A probe based on the N-terminal 300
base pairs of the 1.3 kb clone was used to rescreen the cDNA library. The second
screening resulted in the isolation of an overlapping clone with an extension of 1.9 kb.
Construction of the full length clone using internal restriction sites resulted in a 3.2 kb
clone (encoding the second p160 polypeptide designated herein as p160.2). Further
screening of the cDNA library with a probe which included the N-terminus of the 3.2 kb
clone resulted in the isolation of an isoform of p160 which was 3.9 kb in length
(designated herein as p160.1).
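
A degenerate oligonucleotide probe must cover every codon combination that could encode the chosen peptide stretch, so stretches with low codon degeneracy are usually preferred. The Python sketch below estimates degeneracy for windows of the pk36 peptide using the standard genetic code. The window width of six residues and the helper names are assumptions made for illustration; the text does not state which part of pk36 or what probe length was actually used.

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences that could encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


def least_degenerate_window(peptide: str, width: int) -> str:
    """Return the first window of `width` residues with minimal degeneracy."""
    if width < 1 or width > len(peptide):
        raise ValueError("width must be between 1 and the peptide length")
    windows = [peptide[i:i + width] for i in range(len(peptide) - width + 1)]
    return min(windows, key=degeneracy)


pk36 = "EQDDTAAVLADFID"      # peptide sequence as listed in this Example
WINDOW = 6                   # hypothetical probe length of 18 nucleotides

best = least_degenerate_window(pk36, WINDOW)
print(best, degeneracy(best))
```

In practice, probe design also weighs codon usage bias and the option of inosine at highly degenerate positions, neither of which is modeled here.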

Example III: Biochemical Characterization of p62

The following materials and methods were used throughout this Example:

Cell culture, transfection, and metabolic labeling

HeLa and CD4+ HeLa cells (Shin, J. et al. (1990) EMBO J. 9:425-434) and Jurkat
T cells were maintained in 10% fetal bovine serum supplemented DMEM and RPMI,
respectively. For v-src expression, HeLa cells were transiently transfected with 20 µg
of cDNA per 10 cm plate using the calcium phosphate precipitation method (Chen, C. et
al. (1987) Mol. Cell. Biol. 7:2745-2752). For metabolic labeling, cells were incubated
with 100 µCi/ml 35S-methionine in methionine-free DMEM for one hour.

Site-directed mutagenesis, GST fusion protein production, and protein precipitation

Site-directed mutagenesis was performed on uracil-containing phage DNA
(Kunkel, T. (1985) Proc. Natl. Acad. Sci. USA 82:488-492) using the M13 Muta-Gene
kit (Bio-Rad). GST fusion proteins were produced as described elsewhere (Joung, I. et
al. (1995) Proc. Natl. Acad. Sci. USA 92:5778-5782; Payne, G. et al. (1993) Proc. Natl.
Acad. Sci. USA 90:4902-4906). HeLa cell lysate was prepared and used for GST fusion
protein binding as described (Joung, I. et al. (1995) Proc. Natl. Acad. Sci. USA
92:5778-5782). Phosphatase inhibitors were added as indicated in the Brief Description of the
Drawings section. For the competition assay, the stated amounts of phosphotyrosyl
peptides were added to the lysates during incubation. After washing three times with
lysis buffer, bound proteins were eluted by boiling in SDS-PAGE loading buffer. After
SDS-PAGE, 35S-methionine labeled proteins on the gel were fluorographed, dried, and
visualized by autoradiography. For Western analysis, proteins were electrotransferred to
nitrocellulose and immunoblotted using 4G10 monoclonal antibody and
HRP-conjugated goat anti-mouse antibody. Signals were developed using enhanced
chemiluminescence (Amersham).

Results of Biochemical Characterization of p62:

A. p62 binds to the SH2 domain in a phosphotyrosine-independent manner

GST and GST fusion proteins of p56lck subdomains (Figure 12A) containing
the unique N-terminal region (1-77), the unique N-terminal region and SH3 domain (1-123),
and the SH2 domain (119-224) were incubated with lysates from 35S-methionine labelled
CD4+ HeLa cells. Bound proteins were separated on 9% SDS-PAGE, fluorographed,
and detected by autoradiography. Each subdomain of p56lck can specifically bind to
proteins from this HeLa cell lysate (Figure 12B). In Figure 12B, a 62 kD protein (p62)
that bound specifically to the SH2 domain is marked with an arrow. GST.119-224 (the
SH2 domain alone) uniquely precipitated a 62 kD protein (p62) that was not precipitated
by any of the other proteins (Figure 12B). The binding of p62 to the p56lck SH2
domain was also observed in cell lysate of non-activated Jurkat T cells.

35S-methionine labelled HeLa cells were lysed in the presence or absence of
phosphatase inhibitors (sodium vanadate (NaVO4) and sodium fluoride (NaF)), protease
inhibitors (PMSF and leupeptin), or reducing reagent (DTT). The lysates were
incubated with GST.119-224, and bound proteins were analyzed by SDS-PAGE. p62
could not be detected by immunoblotting using 4G10 anti-phosphotyrosine antibody
(see Figure 15). Furthermore, p62 binding to the SH2 domain was enhanced in cell
lysates prepared in the absence of the phosphatase inhibitors NaVO4 and NaF, while the
binding was insensitive to the lack of protease inhibitors and reducing reagents (Figure
12C). These data suggest that p62 binding to the p56lck SH2 domain is
phosphotyrosine (pY)-independent.

B. p62 binds to a specific site other than the phosphotyrosine-dependent binding
site of the SH2 domain

35S-methionine labelled HeLa cells were lysed in the presence of phosphatase
inhibitors (NaVO4 and NaF). The lysates were incubated with increasing concentrations
of the phosphotyrosyl peptides pY324, pY505, pY771, and pY536. Bound p62 was
separated on 9% SDS-PAGE, fluorographed, and detected by autoradiography.

Two phosphotyrosyl peptides, pY324 and pY505 (derived from polyoma middle
T antigen (EPQpYEEIPIYL) and from the C-terminal negative regulatory region of
p56lck (TEGQpYQPQPA), respectively), bind strongly and specifically to the p56lck
SH2 domain (Payne, G. et al. (1993) Proc. Natl. Acad. Sci. USA 90:4902-4906). These
two specific peptides competed away p62 binding to GST.119-224 at 1 mM and 15 mM
of pY324 and pY505 peptides, respectively (Figure 13). Phosphotyrosyl peptides that
bind poorly (pY771 (SSNpYMAPYDNY) and pY536 (ESEpYGNITYPP)), however,
did not affect p62 binding to GST.119-224. Thus, pY-independent binding of p62 to the
p56lck SH2 domain is interrupted by binding of the phosphotyrosyl peptide to the SH2
domain.

An arginine residue (Arg154 of p56lck) that is conserved in all SH2 domains and
is a part of the pY binding pocket (Mayer, B. et al. (1992) Mol. Cell. Biol. 12:609-618;
Eck, M. et al. (1993) Nature 362:87-91) was mutated to lysine (GST.119-224.R154K).
Specifically, GST alone, GST.119-224, and GST.119-224.R154K were incubated with
v-src transfected HeLa cell lysate in the presence of phosphatase inhibitors. Bound
proteins were analyzed by immunoblotting with anti-phosphotyrosine antibody (Figure
14A). GST alone, GST.119-224, and GST.119-224.R154K were incubated with
35S-methionine labeled HeLa cell lysate in the presence of phosphatase inhibitors.
Competition of p62 binding to the SH2 domain by phosphotyrosyl peptide was
measured by adding 10 mM pY324 peptide to the incubation mixture. Bound proteins
were analyzed by SDS-PAGE. The mutant did not bind to phosphotyrosyl proteins
(Figure 14A). The binding of p62, however, was unaltered in the GST.119-224.R154K
protein and was not inhibited by a high concentration of pY324 (Figure 14B). These data
suggest that p62 binds to a specific site other than the pY-dependent binding site of the
SH2 domain.

C. Phosphotyrosine-independent binding of p62 to the p56lck SH2 domain is also
regulated by phosphorylation of Ser59 of p56lck

The Ser59 phosphorylation site in the unique N-terminal region affects the
binding affinity and specificity of the SH2 domain of p56lck for phosphotyrosyl proteins
(Joung, I. et al. (1995) Proc. Natl. Acad. Sci. USA 92:5778-5782; Winkler, D. et al.
(1993) Proc. Natl. Acad. Sci. USA 90:5176-5180). The effect of the Ser59
phosphorylation site on p62 binding to the p56lck SH2 domain was therefore examined
by comparing protein binding to GST.119-224 and to GST.53-224, which contains the
Ser59 phosphorylation site (amino acid residues 53 to 64). HeLa cells transfected with
v-src or vector alone were labelled with 35S-methionine and lysed in the presence or
absence of phosphatase inhibitors. Samples that were lysed in the absence of
phosphatase inhibitors were treated with an exogenous recombinant phosphatase mixture
(recombinant catalytic fragments of the tyrosine phosphatases LAR, CD45, and
SHPTP-1). The lysates were incubated with GST alone, GST.119-224, and GST.53-224. Bound
proteins were separated on 8% SDS-PAGE, electrotransferred to nitrocellulose, and
detected by autoradiography (Figure 15A). In Figure 15B, the same membrane as in
Figure 15A was immunoblotted with anti-phosphotyrosine antibody (4G10). p62 and
two phosphotyrosyl proteins (pp70 and pp80) are marked. As expected, GST.119-224
precipitated a unique set of phosphotyrosyl proteins (pp130 and pp80) from v-src
transfected cell lysate in the presence of phosphatase inhibitors, while GST.53-224
precipitated the phosphotyrosyl protein pp70 as well as pp130 and pp80 (Joung, I. et al.
(1995) Proc. Natl. Acad. Sci. USA 92:5778-5782). However, in the absence of
phosphatase inhibitors, GST.119-224, but not GST.53-224 or GST alone, strongly
bound to 35S-labeled p62 in both v-src transfected and untransfected cell lysates (Figure
15A).

HeLa cells were labelled with 35S-methionine, lysed in the absence of
phosphatase inhibitors, and incubated with GST alone, GST.119-224, GST.65-224, and
GST.53-224.S59E. Bound proteins were separated on 9% SDS-PAGE, fluorographed,
and detected by autoradiography (Figure 15C). Binding of the SH2 domain in
GST.53-224 to p62 was restored by truncation of the unique N-terminal region (using
GST.65-224, which contains the SH3 and SH2 domains only) or by mutation of Ser59 to Glu59 of
the protein (using GST.53-224.S59E) (Figure 15C; compare to Figure 15A). These
data suggest that the pY-independent binding of p62 to the p56lck SH2 domain is also
regulated by phosphorylation of Ser59, for which the S59E mutation substitutes.

D. p62 is a novel protein and also binds to p120 ras-GAP

A protein of the same molecular weight as p62 (62 kD) was precipitated by an
antiserum raised against p120 ras-GAP but not by control rabbit serum (Figure 16A) or
by antibodies against PI-3 kinase, MAP kinase, CD4, or PLC-γ. 35S-methionine
labelled HeLa cells were lysed in the presence or absence of phosphatase inhibitors. The
lysates were incubated with GST alone or with GST.119-224. Alternatively, the lysates
were immunoprecipitated with anti-GAP antibody or with a preimmune serum. Bound
proteins were separated on 9% SDS-PAGE, fluorographed, and detected by
autoradiography (Figures 16B and 16C). Recombinant p62 GAP binding protein
(rp62GAPbp) was run on SDS-PAGE along with GST.119-224 and the ras-GAP binding
proteins of Figure 16A. Proteins were detected both by autoradiography (Figure 16B)
and by Coomassie blue staining (Figure 16C). The prominent bands in Figure 16C are
rp62GAPbp (lane 1), antibody (lane 2), and fusion protein (lane 3). The 62 kD protein
was precipitated by two different anti-ras-GAP antibodies, indicating that the association
between the 62 kD protein and ras-GAP may be a specific interaction. 35S-methionine
labelled p62 protein bands from Figure 16B were excised and partially digested in a
second-dimension 15% SDS-PAGE. V8 protease digestion of the 62 kD proteins
precipitated by GST.119-224 and anti-GAP antibody produced identical cleavage
patterns (Figure 16D), indicating that p62 can bind to both the p56lck SH2 domain and
ras-GAP.

A "62 kD to 68 kD" phosphotyrosyl-protein has been recognized as a
pY-dependent ras-GAP SH2 domain binding protein (p62GAPbp) and its cDNA has been
cloned (Wong, G. et al. (1992) Cell 69:551-558). However, recombinant p62GAPbp
runs slower than p62 on SDS-PAGE, and in this gel is closer to 68 kD (Figures 16B and
16C). p62 was purified from a 200 liter HeLa cell culture using a GST.119-224 affinity
column, separated on 8% SDS-PAGE, and electrotransferred to PVDF membrane, and the
p62 band was cut from the blot. The p62 was digested with Lys-C. Furthermore, the
amino acid sequence of an internal peptide of purified p62 (Figure 16E) does not match
p62GAPbp or any other known protein sequence in the database. Thus, p62 is a novel
protein and is different from the previously characterized pp62GAPbp.

E. p62 associates with Ser/Thr protein kinase activity

Protein kinase activity as a potential role of proteins that bind to the p56lck SH2
domain in a pY-independent manner was examined. 35S-methionine labelled HeLa cells
were lysed in the presence or absence of phosphatase inhibitors and competing peptide
pY324. The lysates were incubated with GST alone or with GST.119-224. Bound
proteins were separated on 9% SDS-PAGE, fluorographed, and detected by
autoradiography (lanes 2, 4, 6, and 8). Kinase activity was also measured by incubating
the bound proteins with kinase buffer and 32P-γ-ATP (lanes 1, 3, 5, and 7). In addition
to p62, three additional discrete 35S-labeled protein bands, including p160 and two high
molecular weight protein bands, were sometimes observed in HeLa cell lysate as p56lck
SH2 domain binding proteins (Figure 17A, lane 6). When 32P-ATP and kinase reaction
buffer were added, the protein complex containing the p56lck SH2 domain and the
bound proteins induced phosphorylation of p62, p160, and a few other binding proteins
including a 100 kD common GST binding protein (lane 5). This phosphorylation event
was observed neither in the GST-protein complex (lanes 1 and 3) nor in the
GST.SH2-protein complex formed in the presence of NaVO4 and pY324 (lane 7). This kinase
activity can also use myelin basic protein (MBP) as an exogenous substrate (Figure 17B),
and the kinase activity can be eluted from the protein complex by NaVO4 and pY324
(Figure 17C). Sample aliquots of Figure 17A, lanes 2, 4, 6, and 8 were incubated with
kinase buffer, 32P-γ-ATP, and myelin basic protein (MBP) as exogenous substrate. MBP
was separated on 12% SDS-PAGE, and its phosphorylation was visualized by
autoradiography. In Figure 17C, MBP kinase activity (lane 1) was sequentially eluted
with competing pY324 peptide (lane 2) and then with glutathione (lane 3) from
glutathione-agarose bound to GST.119-224 and its associated proteins (part of the
sample shown in Figure 17A, lane 6 was used).

Phosphoamino acid analysis of phosphorylated MBP of Figure 17B produced
mostly phosphoserine and some phosphothreonine (Figure 17D). The same
phosphoamino acid composition was found for endogenous substrates such as p35, p62,
p110, and p160 of Figure 17A, lane 5. These results suggest that one of the
pY-independent proteins binding to the p56lck SH2 domain is a Ser/Thr kinase.

The GST.SH2-protein complex (the same as Figure 17A, lane 5) was separated
on SDS-PAGE that was polymerized in the presence of MBP. Proteins on the gel were
renatured and the location of kinase activity was measured (Figure 17E and Tobe, K. et
al. (1992) J. Biol. Chem. 267:21089-21097). For a positive control, 0.5 µg of purified
p44.erk1 (UBI) was used (lane 5). A sample of an in vitro kinase assay as described in
Figure 17A, lane 5, was separately run on SDS-PAGE (lane 6) and compared with the
in-gel kinase assay. Neither GST itself nor GST-SH2 in the presence of NaVO4 and
pY324 brought down any MBP kinase activity. However, GST-SH2, in the absence of
NaVO4 and the competing peptide, associated with an MBP kinase activity with the
same migration as p62. Thus p62 itself or a protein with similar molecular weight
appears to be a Ser/Thr protein kinase, indicative of its potential role in a kinase cascade
distinct from pathways initiated by binding of pY-proteins.

The pY-independent binding of proteins to the p56lck SH2 domain suggests
another class of protein-protein interactions mediated by SH2 domains. However, p62
interaction with the p56lck SH2 domain does not appear to require serine
phosphorylation, as evidenced by reduced binding in the presence of phosphatase
inhibitors (Figure 12C).

The binding of the SH2 domain, a small module composed of about 100 amino
acids (Pawson, T. et al. (1993) Current Biology 3:434-442), to proteins in two different
ways requires efficient use of the accessible surface. Competition between p62 and
specific phosphotyrosyl-peptide binding to the p56lck SH2 domain (Figure 13) indicates
that occupation of one of these protein binding sites excludes binding to the other site.
Possible mechanisms for this exclusion include (i) the use of a single binding site or two
adjacent sites for these two types of protein interaction, resulting in steric hindrance
induced by the binding of one ligand, or (ii) the allosteric alteration of one site by the
occupation of the other. Although the possibility of a single binding site has not been
excluded, the observation that GST.53-224 binds tightly to phosphotyrosyl proteins but
not to p62 (Figures 15A-15C) indicates that pY-independent binding may use a site
other than the pY binding pocket. Successful binding of GST.SH2.R154K, which has a
dysfunctional pY binding pocket, to p62 (Figures 14A-14B) suggests that these two
binding modes of the SH2 domain have different binding mechanisms if not separate
binding sites. In any case, competition between phosphotyrosyl peptides and p62 for the
p56lck SH2 domain permits only one of these two binding sites to be used at any given
time, thus allowing the maintenance of two separate binding sites on such a small
domain.

The C-terminal pTyr505 suppresses the catalytic activity through intramolecular
interaction with the SH2 domain of p56lck (Cooper, J. et al. (1993) Cell 73:1051-1054;
Chan, A. et al. (1994) Annu. Rev. Immunol. 12:555-592). During T cell activation, the
C-terminal Tyr505 is dephosphorylated, freeing the pY binding pocket of the SH2
domain, and Ser59 undergoes transient phosphorylation following the activation of
MAP kinase. Since the binding of p62 to the p56lck SH2 domain is sensitive both to
Ser59 phosphorylation (Figures 15A-15C) and to phosphotyrosyl peptide binding
(Figure 13), interaction of p62 and the SH2 domain in full length p56lck would be likely to
occur at the time when Tyr505 is dephosphorylated and Ser59 is phosphorylated. Since
MAP kinase activation precedes Ser59 phosphorylation, the pY-independent binding of
the p56lck SH2 domain may be involved in regulation of later stages of signal
transduction.

F. p62 is localized to the cytoplasm and binds to the lck SH2 domain in a
phosphotyrosine-independent manner

Immunofluorescence staining of p62 in HeLa cells showed that p62 is
mostly, if not exclusively, localized to the cytoplasm. Expression of T7-epitope
tagged p62 and its deletion mutants, followed by a GST-SH2 binding assay,
showed that (i) the binding is stronger in the absence of NaVO4, as expected, and (ii) the
binding site for the lck SH2 domain is located in the N-terminal 50 amino acids. A
tyrosine residue (Tyr9) present in the N-terminal 50 amino acids can be mutated to
phenylalanine without any change in binding to the lck SH2 domain. Thus, p62
indeed binds the lck SH2 domain in a phosphotyrosine-independent manner.

In addition, T7-epitope specific immunoprecipitation of p62 pulled down
the same MBP Ser/Thr kinase activity that was seen in the p62-lck.SH2
complex. Furthermore, transient expression of p62 augmented PMA/ionomycin
induced gene activation of the NF-AT transcription factor and IL-2 by 20-fold and
5-fold, respectively, in Jurkat T cells. These results suggest that the cloned cDNA indeed
encodes p62 protein and that its binding mechanism to the lck.SH2 domain is unique
and significant in T cell signaling.

G. p62 can arrest cell cycle progression

When p62 was transiently expressed in p62 positive HeLa cells, the cells stopped
their cell cycle progression at the G1/S boundary as shown by DNA content analysis.
This result was confirmed by biochemical analysis. p62 overexpressing HeLa cells were
found only in interphase, while cells which were not transfected were found in all stages
of the cell cycle including M phase.

H. p62 binds directly and noncovalently to ubiquitin

Potential binding proteins for p62 were sought using p62 as bait in the
GAL4-fusion based yeast two-hybrid system. Forty-six truly positive clones were
obtained and twenty-six of them were initially analyzed. Twenty-three of the twenty-six
positive clones contained the human ubiquitin gene fused to the GAL4 activation
domain. Furthermore, ubiquitin-conjugated Sepharose beads (Ub-Sepharose), but not
Sepharose beads alone, precipitated p62 from HeLa cell lysate, and this ubiquitin-p62
interaction was competed by excess soluble ubiquitin in the reaction mixture. However,
unlike enzymes of the ubiquitin conjugation process such as E1, E2, and E3, ubiquitin
and p62 do not require ATP and DTT for association and dissociation, respectively. In
addition, the ubiquitin binding region of p62 has been mapped to the C-terminal 150
amino acids. These results suggest that p62 binds directly and noncovalently to
ubiquitin and thus that a physiological role of p62 is coupled to ubiquitination-mediated
specific protein degradation.

I. p62 overexpression in HeLa cells stabilizes the tumor suppressor p53

Ubiquitination followed by rapid destruction of cyclins, the mitotic inhibitor p27,
and the tumor suppressor p53 has recently been recognized as a major cell cycle
regulation mechanism. In particular, in HeLa cells, which were transformed by
papilloma virus type 18, the viral E6 protein induced rapid degradation of p53 via activation
of the E6-AP ubiquitin ligase. Destabilization of p53 resulted in suppressed expression of the
cdk inhibitor p21cip, thus resulting in tumorigenesis.

Overexpression of p62 in HeLa cells substantially stabilized p53 and
increased the expression level of p21cip. However, expression levels of G1/S cyclins (D and
E) were not affected by p62 overexpression. In in vitro analysis, p53 was rapidly
degraded upon addition of E6 to rabbit reticulocyte lysate. Addition of p62 to this
reaction prevented the rapid degradation of p53. Furthermore, p62 prevented the
formation of E6-dependent ubiquitin-p53 conjugates. These results suggest that the cell
cycle arrest observed in p62-overexpressing HeLa cells is at least partly due to a
reactivated p53-p21cip cell cycle surveillance system, and that p62 regulates the stability
of p53 by blocking E6-induced ubiquitination.

J. p62 (from HeLa cells) modification is dependent on the cell cycle

When HeLa cells were arrested at M-phase by nocodazole treatment, 100% of
p62H underwent apparent modification(s), as shown by a change in gel mobility to an
apparent size of either 64 kD or 65 kD. This modification is not an artifact
of the nocodazole treatment, because mitotic cells that were released from a hydroxyurea-induced
G1/S block showed the same modification. Furthermore, when the mitotic
cells entered G1 phase, p62 regained its 62 kD mobility on SDS-PAGE.
Additional experiments with more defined time intervals confirmed that the p62
modification occurred only during M-phase.

A few proteins change their mobility on SDS-PAGE upon Ser/Thr
phosphorylation(s) of proline-directed kinase substrate site(s). Interestingly, p62 has
several such phosphorylation sites. In many cases, this type of modification serves as a
critical regulatory element for the function of the target protein. Thus, it is expected that
p62 may also have a role in the cell division process, in addition to a regulatory role in
interphase events, and that its function is tightly regulated.

K. p62 gene family members have distinct roles/mechanisms of action 

Stable overexpression of p62 in the leukemic T cell line Jurkat has been
successfully established. Unlike epithelial cells and fibroblasts (exemplified by HeLa
and NIH3T3 cells), Jurkat cells that overexpress p62 maintain their proliferation as
compared to untransfected Jurkat cells. In two independent parallel experiments using
Jurkat cells and the p56lck-negative mutant cell line J.CaM1.6, only Jurkat cell lines
overexpressing p62 were obtained. No J.CaM1.6 cell lines overexpressing p62 were
obtained. As p62 was originally identified as a cellular ligand for the SH2 domain of
p56lck, it is possible that lack of p56lck may be critical in resistance to p62
overexpression not only in fibroblast and epithelial cells but also in T cells. This result
also indicates that T cells may have a distinct mechanism(s) which can be compatible
with p56lck for cell cycle regulation regarding p62 function. As described, the presence
of hematopoietic lineage specific isoform(s) of p62 may partly account for this
discrepancy.

In addition to some key proteins in the cell cycle machinery, components of
mitogenic transcription factors such as NF-kB, IkB, c-jun, and c-fos are also regulated by
ubiquitination-mediated degradation initiated by external signals. Transient expression
of p62 augmented PMA/Ca++-induced activation of the IL-2 gene in Jurkat T cells. As the
IL-2 promoter contains binding sites for NF-kB and AP-1, it is possible that, in a T cell
environment, overexpression of p62 may affect the fate of some of these transcription
factors upon PMA/Ca++ signals and lead to augmented activation of the IL-2 gene.

In conclusion, based on the results described herein, p62 can be described as a
protein (i) that binds to the p56lck SH2 domain and thus is likely to be involved in
initiation of the signal mediating process upon external stimulus; (ii) that binds to ubiquitin
and is involved in ubiquitin-mediated specific protein degradation downstream of
the signal transduction; (iii) that binds to and uses a Ser/Thr kinase and the p120 ras-GAP
as signal mediators; (iv) that contains regulatory features in itself for tight control
of its functions; and (v) that is expressed as a tissue specific isoform in order to maintain
its functional compatibility or to be used in distinct functions.

M-phase specific modification of p62, as well as its ability to bind to ubiquitin, to
bind the p56lck SH2 domain, to bind to a Ser/Thr kinase, and to bind p120 ras-GAP,
strongly suggests that p62 would be the first identified protein having such a regulated
ubiquitination process.


Example IV: Production of Anti-p62 Antibody 

A 17-mer synthetic peptide (comprising amino acids Ser407 to Asp423 of the
amino acid sequence of Figure 2, SEQ ID NO:2, and encoded by nucleotides 1285 to
1335 of the nucleotide sequence of Figure 1, SEQ ID NO:1) was generated. This
peptide was used as an immunogen in two rabbits. Polyclonal antisera against the 17-mer
peptide were then isolated.
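
The correspondence between the residue numbers and the nucleotide numbers quoted for the 17-mer peptide can be checked with a short calculation. The following is a minimal sketch, assuming only that the coding sequence of SEQ ID NO:1 begins at nucleotide 67 (the CDS start given in the sequence listing) and that residues and nucleotides are numbered from 1; the function names `codon_span` and `peptide_span` are illustrative and do not come from the specification.

```python
# CDS start taken from the sequence listing for SEQ ID NO:1 (CDS begins at nucleotide 67).
CDS_START = 67


def codon_span(residue, cds_start=CDS_START):
    """Return (first_nt, last_nt), 1-based and inclusive, of the codon for one residue."""
    if residue < 1:
        raise ValueError("residue numbering is 1-based")
    first = cds_start + 3 * (residue - 1)
    return first, first + 2


def peptide_span(first_residue, last_residue, cds_start=CDS_START):
    """Return the nucleotide span that encodes residues first_residue..last_residue."""
    if last_residue < first_residue:
        raise ValueError("last_residue must not precede first_residue")
    start, _ = codon_span(first_residue, cds_start)
    _, end = codon_span(last_residue, cds_start)
    return start, end


if __name__ == "__main__":
    start, end = peptide_span(407, 423)  # Ser407 to Asp423
    print(start, end, (end - start + 1) // 3)
```

Under these assumptions the span for Ser407 to Asp423 comes out as nucleotides 1285 to 1335, which is 51 nucleotides, or 17 codons, in agreement with the text.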

Example V: Modification of p62 Polypeptide Domains and Effects of
Modification on p62 Activity



Site-directed mutagenesis was performed on uracil-containing phage DNA
(Kunkel, T. (1985) Proc. Natl. Acad. Sci. USA 82:488-492) using the M13 Muta-Gene
kit (Bio-Rad). The results of the mutagenesis are shown in Table I below.

TABLE I

Deletion Sites                    SH2           Ubiquitin     Inhibition of    Inhibition of
amino acids (nucleic acids)       Binding       Binding       Ubiquitination   Degradation

Wild type (no deletion)           +             +             +                +
Tyr9 to Ser28 (t91 to c150)       [illegible]   nd            nd               nd
Pro29 to Arg50 (c151 to g216)     -             nd            nd               nd
Met1 to Arg50 (a67 to g216)       -             nd            nd               nd
Met1 to Lys187 (a67 to g627)      [illegible]   +             nd               nd
Asp258 to Leu440 (t840 to g1386)  +             [illegible]   nd               nd
Glu32 to Pro322 (g160 to t1032)   nd            [illegible]   nd               nd
Met1 to Lys295 (a67 to g951)      nd            +             +                +


Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no more than
routine experimentation, many equivalents of the specific embodiments of the invention
described herein. Such equivalents are intended to be encompassed by the following
claims.



WO 97/22255 



PCT/US96/19944 




SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANTS: Jaekyoon Shin, Insil Joung, Ratna K. Vadlamudi
and Jack L. Strominger

(ii) TITLE OF INVENTION: p62 POLYPEPTIDES, RELATED POLYPEPTIDES
AND USES THEREFOR

(iii) NUMBER OF SEQUENCES: 22

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: LAHIVE & COCKFIELD
(B) STREET: 60 State Street
(C) CITY: Boston
(D) STATE: Massachusetts
(E) COUNTRY: USA
(F) ZIP: 02109-1875

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/574,959
(B) FILING DATE: 19-DEC-1995

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Mandragouras, Amy E.
(B) REGISTRATION NUMBER: 36,207
(C) REFERENCE/DOCKET NUMBER: DFN-008

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (617)227-7400
(B) TELEFAX: (617)227-5941


(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2083 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 67..1390

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GAATTCGGCA CGAGGCGCGG CGGCTGCGAC CGGGACGGCC CATTTTCCGC CAGCTCGCCO 
60 

CTCGCT ATG GCG TCG CTC ACC GTG AAG GCC TAC CTT CTG GGC AAG GAG 
108 

Met Ala Ser Leu Thr Val Lys Ala Tyr Leu Leu Gly Lys Glu 
1 5 10

GAC GCG GCG CGC GAG ATT CGC CGC TTC AGC TTC TGC TGC AGC CCC GAG 
156 

Asp Ala Ala Arg Glu Ile Arg Arg Phe Ser Phe Cys Cys Ser Pro Glu
15 20 25 30 

CCT GAG GCG GAA GCC GAG GCT GCG GCG GGT CCG GGA CCC TGC GAG CGG 
204 

Pro Glu Ala Glu Ala Glu Ala Ala Ala Gly Pro Gly Pro Cys Glu Arg 
35 40 45 

CTG CTG AGC CGG GTG GCC GCC CTG TTC CCC GCG CTG CGG CCT GGC GGC 
252 

Leu Leu Ser Arg Val Ala Ala Leu Phe Pro Ala Leu Arg Pro Gly Gly 
50 55 60 

TTC CAG GCG CAC TAC CGC GAT GAG GAC GGG GAC TTG GTT GCC TTT TCC 
300 

Phe Gln Ala His Tyr Arg Asp Glu Asp Gly Asp Leu Val Ala Phe Ser
65 70 75 

AGT GAC GAG GAA TTG ACA ATG GCC ATG TCC TAC GTG AAG GAT GAC ATC 
348 

Ser Asp Glu Glu Leu Thr Met Ala Met Ser Tyr Val Lys Asp Asp Ile
80 85 90 

TTC CGA ATC TAC ATT AAA GAG AAA AAA GAG TGC CGG CGG GAC CAC CGC 
396 

Phe Arg Ile Tyr Ile Lys Glu Lys Lys Glu Cys Arg Arg Asp His Arg
95 100 105 110 

CCA CCG TGT GCT CAG GAG GCG CCC CGC AAC ATG GTG CAC CCC AAT GTG 
444 

Pro Pro Cys Ala Gln Glu Ala Pro Arg Asn Met Val His Pro Asn Val
115 120 125 

ATC TGC GAT GGC TGC AAT GGG CCT GTG GTA GGA ACC CGC TAC AAG TGC 
492 

Ile Cys Asp Gly Cys Asn Gly Pro Val Val Gly Thr Arg Tyr Lys Cys
130 135 140 

AGC GTC TGC CCA GAC TAC GAC TTG TGT AGC GTC TGC GAG GGA AAG GGC
540

Ser Val Cys Pro Asp Tyr Asp Leu Cys Ser Val Cys Glu Gly Lys Gly
145 150 155 

TTG CAC CGG GGG CAC ACC AAG CTC GCA TTC CCC AGC CCC TTC GGG CAC 
588 

Leu His Arg Gly His Thr Lys Leu Ala Phe Pro Ser Pro Phe Gly His 
160 165 170 

CTG TCT GAG GGC TTC TCG CAC AGC CGC TGG CTC CGG AAG GTG AAA CAC 
636 

Leu Ser Glu Gly Phe Ser His Ser Arg Trp Leu Arg Lys Val Lys His 

175 180 185 190 

GGA CAC TTC GGG TGG CCA GGA TGG GAA ATG GGT CCA CCA GGA AAC TGG 
684 

Gly His Phe Gly Trp Pro Gly Trp Glu Met Gly Pro Pro Gly Asn Trp 
195 200 205 

AGC CCA CGT CCT CCT CGT GCA GGG GAG GCC CGC CCT GGC CCC ACG GCA 
732 

Ser Pro Arg Pro Pro Arg Ala Gly Glu Ala Arg Pro Gly Pro Thr Ala 
210 215 220 

GAA TCA GCT TCT GGT CCA TCG GAG GAT CCG AGT GTG AAT TTC CTG AAG 
780 

Glu Ser Ala Ser Gly Pro Ser Glu Asp Pro Ser Val Asn Phe Leu Lys 
225 230 235 

AAC GTT GGG GAG AGT GTG GCA GCT GCC CTT AGC CCT CTG GGC ATT GAA 
828 

Asn Val Gly Glu Ser Val Ala Ala Ala Leu Ser Pro Leu Gly Ile Glu
240 245 250 

GTT GAT ATC GAT GTG GAG CAC GGA GGG AAA AGA AGC CGC CTG ACC CCC 
876 

Val Asp Ile Asp Val Glu His Gly Gly Lys Arg Ser Arg Leu Thr Pro

255 260 265 270 

GTC TCT CCA GAG AGT TCC AGC ACA GAG GAG AAG AGC AGC TCA CAG CCA 
924 

Val Ser Pro Glu Ser Ser Ser Thr Glu Glu Lys Ser Ser Ser Gln Pro
275 280 285 

AGC AGC TGC TGC TCT GAC CCC AGC AAG CCG GGT GGG AAT GTT GAG GGC 
972 

Ser Ser Cys Cys Ser Asp Pro Ser Lys Pro Gly Gly Asn Val Glu Gly 
290 295 300 

GCC ACG CAG TCT CTG GCG GAG CAG ATG AGG AAG ATC GCC TTG GAG TCC 
1020 

Ala Thr Gln Ser Leu Ala Glu Gln Met Arg Lys Ile Ala Leu Glu Ser
305 310 315 

GAG GGG CGC CCT GAG GAA CAG ATG GAG TCG GAT AAC TGT TCA GGA GGA
1068

Glu Gly Arg Pro Glu Glu Gln Met Glu Ser Asp Asn Cys Ser Gly Gly
320 325 330 

GAT GAT GAC TGG ACC CAT CTG TCT TCA AAA GAA GTG GAC CCG TCT ACA 
1116 

Asp Asp Asp Trp Thr His Leu Ser Ser Lys Glu Val Asp Pro Ser Thr 
335 340 345 350 

GGT GAA CTC CAG TCC CTA CAG ATG CCA GAA TCC GAA GGG CCA AGC TCT 
1164 

Gly Glu Leu Gln Ser Leu Gln Met Pro Glu Ser Glu Gly Pro Ser Ser
355 360 365 

CTG GAC CCC TCC CAG GAG GGA CCC ACA GGG CTG AAG GAA GCT GCC TTG 
1212 

Leu Asp Pro Ser Gln Glu Gly Pro Thr Gly Leu Lys Glu Ala Ala Leu
370 375 380 

TAC CCA CAT CTA CCG CCA GAG GCT GAC CCG CGG CTG ATT GAG TCC CTC 
1260 

Tyr Pro His Leu Pro Pro Glu Ala Asp Pro Arg Leu Ile Glu Ser Leu
385 390 395 

TCC CAG ATG CTG TCC ATG GGC TTC TCT GAT GAA GGC GGC TGG CTC ACC 
1308 

Ser Gln Met Leu Ser Met Gly Phe Ser Asp Glu Gly Gly Trp Leu Thr
400 405 410 

AGG CTC CTG CAG ACC AAG AAC TAT GAC ATC GGA GCG GCT CTG GAC ACC 
1356 

Arg Leu Leu Gln Thr Lys Asn Tyr Asp Ile Gly Ala Ala Leu Asp Thr
415 420 425 430 

ATC CAG TAT TCA AAG CAT CCC CCG CCG TTG TGA C CACTTTTGCC 
1400 

Ile Gln Tyr Ser Lys His Pro Pro Pro Leu *
435 440 

CACCTCTTCT GCGTGCCCCT CTTCTGTCTC ATAGTTGTGT TAAGCTTGCG TAGAATTGCA 
1460 

GGTCTCTGTA CGGGCCAGTT TCTCTGCCTT CTTCCAGGAT CAGGGGTTAG GGTGCAAGAA 
1520 

GCCATTTAGG GCAGCAAAAC AAGTGACATG AAGGGAGGGT CCCTGTGTGT GTGTGTGCTG 
1580 

ATGTTTCCTG GGTGCCCTGG CTCCTTGCAG CAGGGCTGGG CCTGCGAGAC CCAAGGCTCA 
1640 

CTGCAGCGCG CTCCTGACCC CTCCCTGCAG GGGCTACGTT AGCAGCCCAG CACATAGCTT 
1700 

GCCTAATGGC TTTCACTTTC TCTTTTGTTT TAAATGACTC ATAGGTCCCT GACATTTAGT
1760

TGATTATTTT CTGCTACAGA CCTGGTACAC TCTGATTTTA GATAAAGTAA GCCTAGGTGT
1820 

TGTCAGCAGG CAGGCTGGGG AGGCCAGTGT TGTGGGCTTC CTGCTGGGAC TGAGAAGGCT 
1880 

CACGAAGGGC ATCCGCAATG TTGGTTTCAC TGAGAGCTGC CTCCTGGTCT CTTCACCACT 
1940 

GTAGTTCTCT CATTTCCAAA CCATCAGCTG CTTTTAAAAT AAGATCTCTT TGTAGCCATC 
2000 

CTGTTAAATT TGTAAACAAT CTAATTAAAT GGCATCAGCA CTTTAACCAA TAAAAAAAAA 
2060 

AAAAAAAAAA AAAACTCGAG GGA 
2083 
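
The pairing of codons and residues shown in the listing for SEQ ID NO:1 can be reproduced by translating the coding sequence with the standard genetic code. The following is a minimal sketch, assuming the standard code applies and using only the first fourteen codons and the last six codons as printed above; the names `CODON_TABLE` and `translate` are illustrative and are not part of the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # tolerate the spaces used in the listing
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First fourteen codons of the CDS (nucleotides 67 to 108 of SEQ ID NO:1).
    print(translate("ATG GCG TCG CTC ACC GTG AAG GCC TAC CTT CTG GGC AAG GAG"))
    # Last five sense codons followed by the stop codon.
    print(translate("CAT CCC CCG CCG TTG TGA"))
```

In one-letter code the two fragments correspond to Met Ala Ser Leu Thr Val Lys Ala Tyr Leu Leu Gly Lys Glu and His Pro Pro Pro Leu, as printed in the listing.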



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 440 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ala Ser Leu Thr Val Lys Ala Tyr Leu Leu Gly Lys Glu Asp Ala
1 5 10 15

Ala Arg Glu Ile Arg Arg Phe Ser Phe Cys Cys Ser Pro Glu Pro Glu
20 25 30

Ala Glu Ala Glu Ala Ala Ala Gly Pro Gly Pro Cys Glu Arg Leu Leu
35 40 45

Ser Arg Val Ala Ala Leu Phe Pro Ala Leu Arg Pro Gly Gly Phe Gln
50 55 60

Ala His Tyr Arg Asp Glu Asp Gly Asp Leu Val Ala Phe Ser Ser Asp
65 70 75 80

Glu Glu Leu Thr Met Ala Met Ser Tyr Val Lys Asp Asp Ile Phe Arg
85 90 95

Ile Tyr Ile Lys Glu Lys Lys Glu Cys Arg Arg Asp His Arg Pro Pro
100 105 110

Cys Ala Gln Glu Ala Pro Arg Asn Met Val His Pro Asn Val Ile Cys
115 120 125

Asp Gly Cys Asn Gly Pro Val Val Gly Thr Arg Tyr Lys Cys Ser Val
130 135 140

Cys Pro Asp Tyr Asp Leu Cys Ser Val Cys Glu Gly Lys Gly Leu His
145 150 155 160

Arg Gly His Thr Lys Leu Ala Phe Pro Ser Pro Phe Gly His Leu Ser
165 170 175

Glu Gly Phe Ser His Ser Arg Trp Leu Arg Lys Val Lys His Gly His
180 185 190

Phe Gly Trp Pro Gly Trp Glu Met Gly Pro Pro Gly Asn Trp Ser Pro
195 200 205

Arg Pro Pro Arg Ala Gly Glu Ala Arg Pro Gly Pro Thr Ala Glu Ser
210 215 220

Ala Ser Gly Pro Ser Glu Asp Pro Ser Val Asn Phe Leu Lys Asn Val
225 230 235 240

Gly Glu Ser Val Ala Ala Ala Leu Ser Pro Leu Gly Ile Glu Val Asp
245 250 255

Ile Asp Val Glu His Gly Gly Lys Arg Ser Arg Leu Thr Pro Val Ser
260 265 270

Pro Glu Ser Ser Ser Thr Glu Glu Lys Ser Ser Ser Gln Pro Ser Ser
275 280 285

Cys Cys Ser Asp Pro Ser Lys Pro Gly Gly Asn Val Glu Gly Ala Thr
290 295 300

Gln Ser Leu Ala Glu Gln Met Arg Lys Ile Ala Leu Glu Ser Glu Gly
305 310 315 320

Arg Pro Glu Glu Gln Met Glu Ser Asp Asn Cys Ser Gly Gly Asp Asp
325 330 335

Asp Trp Thr His Leu Ser Ser Lys Glu Val Asp Pro Ser Thr Gly Glu
340 345 350

Leu Gln Ser Leu Gln Met Pro Glu Ser Glu Gly Pro Ser Ser Leu Asp
355 360 365

Pro Ser Gln Glu Gly Pro Thr Gly Leu Lys Glu Ala Ala Leu Tyr Pro
370 375 380

His Leu Pro Pro Glu Ala Asp Pro Arg Leu Ile Glu Ser Leu Ser Gln
385 390 395 400

Met Leu Ser Met Gly Phe Ser Asp Glu Gly Gly Trp Leu Thr Arg Leu
405 410 415

Leu Gln Thr Lys Asn Tyr Asp Ile Gly Ala Ala Leu Asp Thr Ile Gln
420 425 430

Tyr Ser Lys His Pro Pro Pro Leu
435 440



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1977 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..1260

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CGC CGC TTC AGC TTC TGC TTT AGC CCG GAG CCC GAG GCC GAA GCC GAG
48

Arg Arg Phe Ser Phe Cys Phe Ser Pro Glu Pro Glu Ala Glu Ala Glu
1 5 10 15

GCC GCG CCT GGC CCC CGG CCC TGT GAG CGG CTG CTG AAC CGG GTG GCT
96

Ala Ala Pro Gly Pro Arg Pro Cys Glu Arg Leu Leu Asn Arg Val Ala
20 25 30

GCG CTC TTT CCT GTG CTC CGG CCC GGC GGC TTT CAG GCG CAC TAC CGC
144

Ala Leu Phe Pro Val Leu Arg Pro Gly Gly Phe Gln Ala His Tyr Arg
35 40 45

GAT GAG GAT GGG GAC TTG GTT GCC TTT TCC AGT GAC GAG GAG CTG ACG
192

Asp Glu Asp Gly Asp Leu Val Ala Phe Ser Ser Asp Glu Glu Leu Thr
50 55 60

ATG GCG ATG TCA TAT GTG AAG GAC GAC ATC TTC CGC ATT TAC ATT AAA
240

Met Ala Met Ser Tyr Val Lys Asp Asp Ile Phe Arg Ile Tyr Ile Lys
65 70 75 80

GAG AAG AAG GAG TGT CGG AGG GAT CAG CGC CCC TCA TGT GCC CAG GAG
288

Glu Lys Lys Glu Cys Arg Arg Asp Gln Arg Pro Ser Cys Ala Gln Glu
85 90 95

GTG CCC AGA AAC ATG GTG CAC CCC AAC GTG ATC TGT GAC GGC TGT AAC
336

Val Pro Arg Asn Met Val His Pro Asn Val Ile Cys Asp Gly Cys Asn
100 105 110

GGG CCC GTG GTG GGG ACG CGC TAC AAG TGC AGC GTC TGC CCT GAC TAC
384

Gly Pro Val Val Gly Thr Arg Tyr Lys Cys Ser Val Cys Pro Asp Tyr
115 120 125

GAC CTA TTC TCC GCC TGC GAG GGC AAG GGC CTG CAC CGG GAA CAC GGC
432

Asp Leu Phe Ser Ala Cys Glu Gly Lys Gly Leu His Arg Glu His Gly
130 135 140

AAG CTG GCT TTC CCC AGC CCC ATT GGG CAC TTC TCT GAG GGC TTC TCT
480

Lys Leu Ala Phe Pro Ser Pro Ile Gly His Phe Ser Glu Gly Phe Ser
145 150 155 160

CAC AGC CGC TGG CTC CGG AAG CTG AAA CAT GGG CAA TTT GGG TGG CCT
528

His Ser Arg Trp Leu Arg Lys Leu Lys His Gly Gln Phe Gly Trp Pro
165 170 175

GCC TGG GAC ATG GGC ACA CCG GGG AAC TGG AGC CCA CGT CCT CCT CAG
576

Ala Trp Asp Met Gly Thr Pro Gly Asn Trp Ser Pro Arg Pro Pro Gln
180 185 190

GCA GGG GAT GCC CAC CCT GCC CCT GCC ACG GAA TCA GCC TCT GGT CCA
624

Ala Gly Asp Ala His Pro Ala Pro Ala Thr Glu Ser Ala Ser Gly Pro
195 200 205

TCG GAA CAT CCC AGT GTG AAT TTC CTC AAG AAC GTA GGG GAG AGT GTG
672

Ser Glu His Pro Ser Val Asn Phe Leu Lys Asn Val Gly Glu Ser Val
210 215 220

GCG GCT GCC CTC AAG CCT CTA GGG ATT GAA GTC GAT ATT GTA GTG GAA
720

Ala Ala Ala Leu Lys Pro Leu Gly Ile Glu Val Asp Ile Val Val Glu
225 230 235 240

ACG CGA GGC AAG AGA AGC CGC CTG ACC CCC ACC TCT GCA GGC AGT TCC
768

Thr Arg Gly Lys Arg Ser Arg Leu Thr Pro Thr Ser Ala Gly Ser Ser
245 250 255

AGC ACA GAG GAG AAG TGT AGC TCT CAG CCA AGC AGC TGC TGC TCT GAC
816

Ser Thr Glu Glu Lys Cys Ser Ser Gln Pro Ser Ser Cys Cys Ser Asp
260 265 270

CCC AGC AAG CCA GAC AGG GAC GTG GAG GGC ACA GCA CAG TCT CTG ACG
864

Pro Ser Lys Pro Asp Arg Asp Val Glu Gly Thr Ala Gln Ser Leu Thr
275 280 285 

GAG CAG ATG AAT AAG ATC GCC CTG GAG TCA GGG GGT CAG CAT GAG GAA 
912 

Glu Gln Met Asn Lys Ile Ala Leu Glu Ser Gly Gly Gln His Glu Glu
290 295 300 

CAG ATG GAG TCT GAT AAC TGT TCA GGA GGA GAT GAT GAC TGG ACT CAT 
960 

Gln Met Glu Ser Asp Asn Cys Ser Gly Gly Asp Asp Asp Trp Thr His

305 310 315 320 

CTG TCT TCA AAA GAG GTG GAC CCG TCT ACA GGT GAA CTG CAG TCT CTA 
1008 

Leu Ser Ser Lys Glu Val Asp Pro Ser Thr Gly Glu Leu Gln Ser Leu
325 330 335 

CAG ATG CCT GAG TCT GAA GGG CCA AGC TCT CTG GAT GGT TCC CAG GAA 
1056 

Gln Met Pro Glu Ser Glu Gly Pro Ser Ser Leu Asp Gly Ser Gln Glu
340 345 350 

GGA CCC ACA GGA CTG AAG GAA GCT GAA CTG TAC CCA CAT CTG CCA CCA 
1104 

Gly Pro Thr Gly Leu Lys Glu Ala Glu Leu Tyr Pro His Leu Pro Pro 
355 360 365 

GAA GCT GAC CCC CGG CTG ATT GAG TCC CTC TCC CAG ATG CTG TCC ATG 
1152 

Glu Ala Asp Pro Arg Leu Ile Glu Ser Leu Ser Gln Met Leu Ser Met
370 375 380 

GTC TCT GAT GAA GGT GGC TGG CTC ACC AGG CTT CTG CAG ACC AAG AAT 
1200 

Val Ser Asp Glu Gly Gly Trp Leu Thr Arg Leu Leu Gln Thr Lys Asn

385 390 395 400 

TAC GAC ATC GGG GCT GCC CTG AAC ACC ATC CAG TAT TCA AAA CAC CCA 
1248 

Tyr Asp Ile Gly Ala Ala Leu Asn Thr Ile Gln Tyr Ser Lys His Pro
405 410 415 

CCA CCT TTG TGACGATGTT TGCTCACCCA TTCTGTGTCC CCTTTGAGTT 
1297 

Pro Pro Leu 

420 

AGTGTAGAAC CCCACTGCCT CTAAGTCCCA ATTTCTCGTC ATTCTTCTTT CAGAATCTGG 
1357 

GGGGTGGGGA TGCAGAAAGC CCTTTAGGGC AGTAGAACAA GTGACACGGG GGGAGTTCCA
1417

AGGGTGTGAG TGCGGATTCT GAGAAACACT GATCAGCTTC CCATGGATGC TGGCTCCTTC
1477 

CAGCCAGGGG ACCCCGCCCT GGGGCAGAGC GAGAGACTCC TCGCTGGGGA GGACGTGGAG 
1537 

ACCATACTGC ATCTTATCCG TACTCTCCCT GCAGGATTAC ACCAGCAGTC CAGAAGAGAT 
1597 

CTTGCCAAAT GGCTTTCTGC TTTTTCTTTG TATAGGACAC TGATATGTAA CTGATTTTAT 
1657 

GCTAGAAGTT TGATATCCTC TGAATTTAGC TAAAGGATCA CCAGCATTCA CCCCGGGGTG 
1717 

GAAGAGGCTG TCCTGTAGCA ATTACAGCTC AGGACTGTGG CTAACATCTG AGGAATAAAG 
1777 

AAGGGCTGAC AGAGGAACTG ATGCTGTTCA GAGTACTGCC TATTTCATAA CCACTGTAGT 
1837 

TACCGTTTCC AAACCTGTCA GCTGCTTTTA AAGTTAAGAA AATCGCTTTG TAACCATTCT
1897 

ATTTGTAAAC AATTTTAATT AATTAAAGGT ATAAGCACTT TAATCAAAAA AAAAAAAAAA 
1957 

AAATTCCACC ACACTGGCGG 
1977 



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 419 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Arg Arg Phe Ser Phe Cys Phe Ser Pro Glu Pro Glu Ala Glu Ala Glu
1 5 10 15

Ala Ala Pro Gly Pro Arg Pro Cys Glu Arg Leu Leu Asn Arg Val Ala
20 25 30

Ala Leu Phe Pro Val Leu Arg Pro Gly Gly Phe Gln Ala His Tyr Arg
35 40 45

Asp Glu Asp Gly Asp Leu Val Ala Phe Ser Ser Asp Glu Glu Leu Thr
50 55 60

Met Ala Met Ser Tyr Val Lys Asp Asp Ile Phe Arg Ile Tyr Ile Lys
65 70 75 80

Glu Lys Lys Glu Cys Arg Arg Asp Gln Arg Pro Ser Cys Ala Gln Glu
85 90 95

Val Pro Arg Asn Met Val His Pro Asn Val Ile Cys Asp Gly Cys Asn
100 105 110

Gly Pro Val Val Gly Thr Arg Tyr Lys Cys Ser Val Cys Pro Asp Tyr
115 120 125

Asp Leu Phe Ser Ala Cys Glu Gly Lys Gly Leu His Arg Glu His Gly
130 135 140

Lys Leu Ala Phe Pro Ser Pro Ile Gly His Phe Ser Glu Gly Phe Ser
145 150 155 160

His Ser Arg Trp Leu Arg Lys Leu Lys His Gly Gln Phe Gly Trp Pro
165 170 175

Ala Trp Asp Met Gly Thr Pro Gly Asn Trp Ser Pro Arg Pro Pro Gln
180 185 190

Ala Gly Asp Ala His Pro Ala Pro Ala Thr Glu Ser Ala Ser Gly Pro
195 200 205

Ser Glu His Pro Ser Val Asn Phe Leu Lys Asn Val Gly Glu Ser Val
210 215 220

Ala Ala Ala Leu Lys Pro Leu Gly Ile Glu Val Asp Ile Val Val Glu
225 230 235 240

Thr Arg Gly Lys Arg Ser Arg Leu Thr Pro Thr Ser Ala Gly Ser Ser
245 250 255

Ser Thr Glu Glu Lys Cys Ser Ser Gln Pro Ser Ser Cys Cys Ser Asp
260 265 270

Pro Ser Lys Pro Asp Arg Asp Val Glu Gly Thr Ala Gln Ser Leu Thr
275 280 285

Glu Gln Met Asn Lys Ile Ala Leu Glu Ser Gly Gly Gln His Glu Glu
290 295 300

Gln Met Glu Ser Asp Asn Cys Ser Gly Gly Asp Asp Asp Trp Thr His
305 310 315 320

Leu Ser Ser Lys Glu Val Asp Pro Ser Thr Gly Glu Leu Gln Ser Leu
325 330 335

Gln Met Pro Glu Ser Glu Gly Pro Ser Ser Leu Asp Gly Ser Gln Glu
340 345 350

Gly Pro Thr Gly Leu Lys Glu Ala Glu Leu Tyr Pro His Leu Pro Pro
355 360 365

Glu Ala Asp Pro Arg Leu Ile Glu Ser Leu Ser Gln Met Leu Ser Met
370 375 380

Val Ser Asp Glu Gly Gly Trp Leu Thr Arg Leu Leu Gln Thr Lys Asn
385 390 395 400

Tyr Asp Ile Gly Ala Ala Leu Asn Thr Ile Gln Tyr Ser Lys His Pro
405 410 415

Pro Pro Leu
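
The similarity between SEQ ID NO:2 and SEQ ID NO:4 can be illustrated by counting identical positions over a short stretch. The following is a minimal sketch, not an alignment method from the specification. It assumes an ungapped pairing in which residue 1 of SEQ ID NO:4 is placed opposite residue 21 of SEQ ID NO:2; that offset is an assumption made here from inspection of the two listings, and `percent_identity` is an illustrative helper name.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two equal-length residue lists."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length for an ungapped comparison")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Residues 21 to 36 of SEQ ID NO:2 (assumed pairing, see the note above).
seq2_21_36 = "Arg Arg Phe Ser Phe Cys Cys Ser Pro Glu Pro Glu Ala Glu Ala Glu".split()
# Residues 1 to 16 of SEQ ID NO:4.
seq4_1_16 = "Arg Arg Phe Ser Phe Cys Phe Ser Pro Glu Pro Glu Ala Glu Ala Glu".split()

if __name__ == "__main__":
    print(percent_identity(seq2_21_36, seq4_1_16))
```

Over this 16-residue window the two sequences differ at a single position (Cys in SEQ ID NO:2, Phe in SEQ ID NO:4), which gives 93.75% identity. A comparison of the full-length sequences would need a gapped alignment tool.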



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 101 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Trp Phe Phe Lys Asn Leu Ser Arg Lys Asp Ala Glu Arg Gln Leu Leu
1 5 10 15

Ala Pro Gly Asn Thr His Gly Ser Phe Leu Ile Arg Glu Ser Glu Ser
20 25 30

Thr Ala Gly Ser Phe Ser Leu Ser Val Arg Asp Phe Asp Gln Asn Gln
35 40 45

Gly Glu Val Val Lys His Tyr Lys Ile Arg Asn Leu Asp Asn Gly Gly
50 55 60

Phe Tyr Ile Ser Pro Arg Ile Thr Phe Pro Gly Leu His Glu Leu Val
65 70 75 80

Arg His Tyr Thr Asn Ala Ser Asp Gly Leu Cys Thr Arg Leu Ser Arg
85 90 95

Pro Cys Gln Thr Gln
100
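
Three-letter listings such as SEQ ID NO:5 are often easier to compare or search once converted to one-letter code. The following is a minimal sketch of such a conversion. It also assumes that scanned copies of a listing may render Gln as "Gin" and Ile as "lie" or "He", and it repairs those three tokens before lookup; the `OCR_FIXES` mapping and the function name `to_one_letter` are illustrative and are not part of the specification.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Assumed scanner misreadings of Gln and Ile (hypothetical repair list).
OCR_FIXES = {"Gin": "Gln", "lie": "Ile", "He": "Ile"}

# SEQ ID NO:5 typed as it may appear in a scanned copy, misreadings included.
SEQ_ID_NO_5 = """
Trp Phe Phe Lys Asn Leu Ser Arg Lys Asp Ala Glu Arg Gin Leu Leu
Ala Pro Gly Asn Thr His Gly Ser Phe Leu He Arg Glu Ser Glu Ser
Thr Ala Gly Ser Phe Ser Leu Ser Val Arg Asp Phe Asp Gin Asn Gin
Gly Glu Val Val Lys His Tyr Lys He Arg Asn Leu Asp Asn Gly Gly
Phe Tyr He Ser Pro Arg He Thr Phe Pro Gly Leu His Glu Leu Val
Arg His Tyr Thr Asn Ala Ser Asp Gly Leu Cys Thr Arg Leu Ser Arg
Pro Cys Gin Thr Gin
"""


def to_one_letter(text):
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    letters = []
    for token in text.split():
        token = OCR_FIXES.get(token, token)
        letters.append(THREE_TO_ONE[token])  # KeyError flags any unrecognised token
    return "".join(letters)


if __name__ == "__main__":
    one_letter = to_one_letter(SEQ_ID_NO_5)
    print(len(one_letter))
    print(one_letter)
```

The resulting string has 101 characters, matching the length stated for SEQ ID NO:5. An unrecognised token raises a `KeyError` instead of being skipped, so further misreadings are not silently dropped.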



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3901 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 439..3847

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

GGGGCAGCCG TTCTGAGTGG GCCCTCTGCG GGCTCCGCGG CTGGGGTTCC TGGCGGGACC 
60 

GGGGGTCTCT CGGCAGTGAG CTCGGGCCCG CGGCTCCGCC TGCTGCTGCT GGAGAGTGTT 
120 

TCTGGTTTGC TGCAACCTCG AACGGGGTCT GCCGTTGCTC CGGTGCATCC CCCAAACCGC 
180 

TCGGCCCCAC ATTTGCCCGG GCTCATGTGC CTATTGCGGC TGCATGGGTC GGTGGGCGGG 
240 

GCCCAGAACC TTTCAGCTCT TGGGGCATTG GTGAGTCTCA GTAATGCACG TCTCAGTTCC
300 

ATCAAAACTC GGTTTGAGGG CCTGTGTCTG CTGTCCCTGC TGGTAGGGGA GAGCCCCACA 
360 

GAGCTATTCC AGCAGCACTG TGTGTCTTGG CTTCGGAGCA TTCAGCAGGT GTTACAGACC 
420 

CAGGACCCGC CTGCCACA ATG GAG CTG GCC GTG GCT GTC CTG AGG GAC CTC 
471 

Met Glu Leu Ala Val Ala Val Leu Arg Asp Leu 
1 5 10

CTC CGA TAT GCA GCC CAG CTG CCT GCA CTG TTC CGG GAC ATC TCC ATG 
519 

Leu Arg Tyr Ala Ala Gln Leu Pro Ala Leu Phe Arg Asp Ile Ser Met
15 20 25 

AAC CAC CTC CCT GGC CTT CTC ACC TCC CTG CTG GGC CTC AGG CCA GAG 
567 

Asn His Leu Pro Gly Leu Leu Thr Ser Leu Leu Gly Leu Arg Pro Glu 
30 35 40 

TGT GAG CAG TCA GCA TTG GAA GGA ATG AAG GCT TGT ATG ACC TAT TTC 
615 

Cys Glu Gin Ser Ala Leu Glu Gly Met Lys Ala Cys Met Thr Tyr Phe 
45 50 55 
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CCT CGG GCT TGT 
663 

Pro Arg Ala Cys 
60 

TCT AGG GTG GAT 
711 

Ser Arg Val Asp 



TGT TAT TCC CGG 
759 

Cys Tyr Ser Arg 
95 

AAG CAC ACC GAG 
807 

Lys His Thr Glu 
110 

CTG CAC ACC CTG 
855 

Leu His Thr Leu 
125 

GTG CAG AAT GAA 
903 

Val Gin Asn Glu 
140 

GGT GAT GCC CAT 
951 

Gly Asp Ala His 



GCC CGC TGC CTA 
999 

Ala Arg Cys Leu 
175 

TCC GTC CCT GTG 
1047 

Ser Val Pro Val 
190 

GTC AGT AGC AAG 
1095 

Val Ser Ser Lys 
205 

CTT GCT CAG GAT 
1143 

Leu Ala Gin Asp 
220 



GGT TCT CTC AAA 

Gly Ser Leu Lys 
65 

GCC TTG AGC CCT 

Ala Leu Ser Pro 
80 

CTG CCC TCT TTA 

Leu Pro Ser Leu 

AGC TGG GAG CAG 

Ser Trp Glu Gin 
115 

CTG GGG GCC CTG 

Leu Gly Ala Leu 
130 

GGC CCT GGG GTG 

Gly Pro Gly Val 
145 

GTC CTT CTC CAG 

Val Leu Leu Gin 
160 

GGG CTC ATG CTC 
Gly Leu Met Leu 

CAG GAA ATC CTG 

Gin Glu lie Leu 
195 

AAT ATT GTA AGT 

Asn lie Val Ser 
210 

ACC AGG CAA CCA 

Thr Arg Gin Pro 

225 



-83- 

GGC AAG CTG GCC 

Gly Lys Leu Ala 

70 

CAG CTC CAA CAG 

Gin Leu Gin Gin 
85 

GGG GCT GGC TTT 

Gly Ala Gly Phe 
100 

GAG CTA CAC AGT 
Glu Leu His Ser 

TAC GAG GGA GCA 

Tyr Glu Gly Ala 
135 

GAG ATG CTG CTG 

Glu Met Leu Leu 
150 

CTT CGG CAG AGG 

Leu Arg Gin Arg 
165 

AGC TCT GAG TTT 

Ser Ser Glu Phe 
180 

GAT TTC ATC TGC 
Asp Phe lie Cys 

GGG ATT TGT CAT 

Gly He Cys His 
215 

GGA AAG TAC TGG 

Gly Lys Tyr Trp 
230 



TCA TTT TTT CTG 

Ser Phe Phe Leu 
75 

TTG GCC TGT GAG 

Leu Ala Cys Glu 
90 

TCC CAA GGC CTG 

Ser Gin Gly Leu 
105 

CTG CTG GCC TCA 

Leu Leu Ala Ser 
120 

GAG ACT GCT CCT 
Glu Thr Ala Pro 

TCC TCA GAA GAT 

Ser Ser Glu Asp 
155 

TTT TCG GGA CTG 

Phe Ser Gly Leu 
170 

GGA GCT CCC GTG 

Gly Ala Pro Val 
185 

CGG ACC CTC AGC 

Arg Thr Leu Ser 
200 

CTC TTC AGA GCC 
Leu Phe Arg Ala 

GGA CCT GAG TCT 

Gly Pro Glu Ser 

235 
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CCC CAA ACA GTG 
1191 

Pro Gin Thr Val 



GTC CAA ATA ACA 
1239 

Val Gin lie Thr 
255 

CAG AGT GTA GCA 
1287 

Gin Ser Val Ala 
270 

GCT GAG TCA TTG 
1335 

Ala Glu Ser Leu 

285 

GGG TCT ATT TTA 
1383 

Gly Ser He Leu 
300 

TCA GGG GTT GGG 
1431 

Ser Gly Val Gly 



CCT GTT TCT GTC 
1479 

Pro Val Ser Val 
335 

CTC TGC CCC TTT 
1527 

Leu Cys Pro Phe 
350 

TGC TGC TGC TGC 
1575 

Cys Cys Cys Cys 
365 

GCA CTC ATC CTC 
1623 

Ala Leu He Leu 
380 

ATC GGC CGC CTG 
1671 

He Gly Arg Leu 



TCA TCC TGG AGT 

Ser Ser Trp Ser 
240 

TCA CTT CCT ATG 
Ser Leu Pro Met 

AAT GCT TCC TTG 

Asn Ala Ser Leu 
275 

CTG AGA GGC CCA 

Leu Arg Gly Pro 
290 

GAG GAT AGG GGT 

Glu Asp Arg Gly 
305 

TTT CTT ACC TAT 

Phe Leu Thr Tyr 
320 

TCT CTC TGG CTC 
Ser Leu Trp Leu 

TTT CTC CAG AGC 

Phe Leu Gin Ser 
355 

CCT CTA TCC ACC 

Pro Leu Ser Thr 
370 

GCG TGT GGA AGC 

Ala Cys Gly Ser 
385 

CTT CCC CAG GTC 

Leu Pro Gin Val 
400 



-84- 

CCG TCC CAG AGA 

Pro Ser Gin Arg 
245 

TGT CGT GAC ACA 

Cys Arg Asp Thr 
260 

GGG GAG GGT GAA 
Gly Glu Gly Glu 

GCC ATC CTT CTT 

Ala He Leu Leu 
295 

TTG ATT TTG TTG 

Leu He Leu Leu 
310 

GTG TAC ATA TGT 

Val Tyr He Cys 
325 

TCA CTT TCT TCC 

Ser Leu Ser Ser 
340 

TTG CAT GGA GAT 
Leu His Gly Asp 

TTG AAG GCC TTG 

Leu Lys Ala Leu 
375 

CGG CTC TTG CGC 

Arg Leu Leu Arg 
390 

CTC AAT TCC TGG 

Leu Asn Ser Trp 
405 
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GCT TCT ACT TTT 

Ala Ser Thr Phe 
250 

GGA GCA CAG TGT 

Gly Ala Gin Cys 
265 

TTT GGG GAC TCA 

Phe Gly Asp Ser 
280 

ACC TTC CAT CCA 
Thr Phe His Pro 

GGA GAG ATG AGA 

Gly Glu Met Arg 

315 

AAA TGG TCA TTC 

Lys Trp Ser Phe 
330 

TCC ACT CTT TAT 

Ser Thr Leu Tyr 
345 

GGT CCC TGC GGC 

Gly Pro Cys Gly 
360 

GAC CTG CTG TCT 

Asp Leu Leu Ser 

TTT GGG ATC CTG 

Phe Gly He Leu 
395 

AGC ATC GGT AGA 

Ser He Gly Arg 
410 
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GAT TCC CTC TCT 
1719 

Asp Ser Leu Ser 
415 

AAG GTG TAT GCG 
1767 

Lys Val Tyr Ala 
430 

GCG GGA ATG CTT 
1815 

Ala Gly Met Leu 
445 

CTG CTC AGC GAC 
1863 

Leu Leu Ser Asp 
460 

CCG CGG GGG AGC 
1911 

Pro Arg Gly Ser 

CCC AAG AAG CTA 
1959 

Pro Lys Lys Leu 
495 

CAC CGG AAA GGG 
2007 

His Arg Lys Gly 
510 

CTC AGA GGC CTC 
2055 

Leu Arg Gly Leu 
525 

GAG GAG ACT CAC 
2103 

Glu Glu Thr His 
54 0 

GGT GTA CAG CAG 
2151 

Gly Val Gin Gin 



CCT GCC GCC GTG 
2199 

Pro Ala Ala Val 
575 



CCA GGC CAG GAG 
Pro Gly Gin Glu 

ATA TTA GAG CTG 

Ile Leu Glu Leu
435 

CAG GGA GGA GCC 

Gln Gly Gly Ala
450 

ATC TCC CCG CCA 

Ile Ser Pro Pro
465 

CCT GAT GGG AGT 

Pro Asp Gly Ser 
480 

AAG CTG GAT GTG 
Lys Leu Asp Val 

GAT AGC AAT GCC 

Asp Ser Asn Ala 
515 

AGC CGG ACC ATC 

Ser Arg Thr Ile
530 

AGG AGA CTG CAT 

Arg Arg Leu His 
545 

GGT GAG GTC CTA 

Gly Glu Val Leu 
560 

AAC TCT ACT GCC 
Asn Ser Thr Ala 
AGG CCT TAC AGC

Arg Pro Tyr Ser 
420 

TGG GTG CAG GTT 
Trp Val Gln Val

TCT GGA GAG GCC 

Ser Gly Glu Ala 
455 

GCT GAT GCC CTT 

Ala Asp Ala Leu 
470 

TTG CAG ACT GGG 

Leu Gln Thr Gly
485 

GGG GAA GCT ATG 

Gly Glu Ala Met 
500 

AAC AGC GAC GTG 
Asn Ser Asp Val 

CTC ATG TGT GGG 

Leu Met Cys Gly 
535 

GAC CTG GTC CTC 

Asp Leu Val Leu 
550 

GGC AGC TCC CCG 

Gly Ser Ser Pro 
565 

TGC TGC TGG CGC 

Cys Cys Trp Arg 
580 



ACG GTT CGG ACC 

Thr Val Arg Thr 
425 

TGT GGG GCC TCG 

Cys Gly Ala Ser 
440 

CTG CTC ACC CAC 
Leu Leu Thr His 

AAG CTG CGT AGC 

Lys Leu Arg Ser 
475 

AAG CCT AGC GCC 

Lys Pro Ser Ala 
490 

GCC CCG CCA AGC 

Ala Pro Pro Ser 
505 

TGT CCG GCT GCA 

Cys Pro Ala Ala 
520 

CCT CTC ATC AAG 
Pro Leu Ile Lys

CCC CTG GTC ATG 

Pro Leu Val Met 
555 

TAC ACG AGC TCC 

Tyr Thr Ser Ser 
570 

TGC TGC TGG CCC 

Cys Cys Trp Pro 
585 
CGT CTC CTC GCT
2247 

Arg Leu Leu Ala 
590 

CCC TCG GCC AGC 
2295 

Pro Ser Ala Ser 
605 

GAA GCA CTG GTG 
2343 

Glu Ala Leu Val 
620 

CTG CAG CCC ATG 
2391 

Leu Gln Pro Met



CTG AGG CCC CAT 
2439 

Leu Arg Pro His 
655 

CCA TGC CCT CAG 
2487 

Pro Cys Pro Gln
670 

GCA GGC CCC ATG 
2535 

Ala Gly Pro Met 
685 

TCC ACC ACA GCC 
2583 

Ser Thr Thr Ala 
700 

CCT CCC CGG CTT 
2631 

Pro Pro Arg Leu 



GAG GAC CCC ATC 
2679 

Glu Asp Pro Ile
735 

CCA GAT GAA ACT 
2727 

Pro Asp Glu Thr 
750 



GCC CAC CTC CTC 

Ala His Leu Leu 
595 

GAG AAG ATA GCC 

Glu Lys Ile Ala
610 

ACC TGT GCT GCT 

Thr Cys Ala Ala 
625 

GGC CCC ACC TGC 

Gly Pro Thr Cys 
640 

CGC CCT TCA GGG 
Arg Pro Ser Gly 

TGG GCT CCA TGC 

Trp Ala Pro Cys 
675 

CCC TCA GCA GGC 

Pro Ser Ala Gly 
690 

AAC CTC CTA GGC 

Asn Leu Leu Gly 
705 

CTT CCT GGC CCT 

Leu Pro Gly Pro 
720 

CTT GCC CCT AGT 
Leu Ala Pro Ser 

TTT GGG GGG AGA 

Phe Gly Gly Arg 
755 
TTG CCT GTG CCC
Leu Pro Val Pro 

TTG AGG TCT CCT 

Leu Arg Ser Pro 
615 

CTG ACC CAC CCC 

Leu Thr His Pro 
630 

CCC ACA CCT GCT 

Pro Thr Pro Ala 
645 

CCC CAC CGT TCC 

Pro His Arg Ser 
660 

CCT CAG CAG GCC 
Pro Gln Gln Ala

CCT GTG CCC TCG 

Pro Val Pro Ser 
695 

CTT CTG TCC AGG 

Leu Leu Ser Arg 
710 

GAG AAC CAC CGG 

Glu Asn His Arg 
725 

GGG ACT CCC CCA 

Gly Thr Pro Pro 
740 

GTG CCC AGA CCA 
Val Pro Arg Pro 



TGC AAG CCT TCT 

Cys Lys Pro Ser 
600 

CTT TCT TGC TCA 
Leu Ser Cys Ser 

CGG GTT CCT CCC 

Arg Val Pro Pro 
635 

CCA GTC CCC CTC 

Pro Val Pro Leu 
650 

ATC CTC CGG GCC 

Ile Leu Arg Ala
665 

CCA TGC CCT TCA 

Pro Cys Pro Ser 
680 

GAG CCC TGG ACC 
Glu Pro Trp Thr 

CCT AGT GTC TGT 

Pro Ser Val Cys 
715 

GCA GGC TCA AAT 

Ala Gly Ser Asn 
730 

CCT ACT ATA CCC 

Pro Thr lie Pro 
745 

GCC TTT GTC CAC 

Ala Phe Val His 
760 
TAT GAC AAG GAG
2775 

Tyr Asp Lys Glu 
765 

TCT GAT GAC AGC 
2823 

Ser Asp Asp Ser 
780 

CCC CCA CCA CCC 
2871 

Pro Pro Pro Pro 



CCA CCA ACA GCC 
2919 

Pro Pro Thr Ala 
815 

CTT CCT GCG GCC 
2967 

Leu Pro Ala Ala 
830 

CCT GTT CCT GGT 
3015 

Pro Val Pro Gly 
845 

GGG ACT CCT GGT 
3063 

Gly Thr Pro Gly 
860 

GTT ATT AAT ATC 
3111 

Val Ile Asn Ile



GAG GAA GAA GAA 
3159 

Glu Glu Glu Glu 
895 

GAA GAG GAA GAG 
3207 

Glu Glu Glu Glu 
910 

GAA TAT TTT GAA 
3255 

Glu Tyr Phe Glu 
925 



GAG GCA TCT GAT 

Glu Ala Ser Asp 
770 

GTG GTG ATC GTG 

Val Val Ile Val
785 

TCA GGT GCC ACA 

Ser Gly Ala Thr 
800 

TCC CCT CCT GTG 
Ser Pro Pro Val 

CCA GGG CCT CTC 

Pro Gly Pro Leu 
835 

CCT GTG ACC CTC 

Pro Val Thr Leu 
850 

GGG GGA GGA CCC 

Gly Gly Gly Pro 
865 

AAC AGC AGT GAT 

Asn Ser Ser Asp 
880 

GAA GAA GAA GAA 
Glu Glu Glu Glu 

GAG GAA GAC TTT 

Glu Glu Asp Phe 
915 

GAG GAA GAA GAG 

Glu Glu Glu Glu 
930 




GTG GAG ATC TCC 

Val Glu Ile Ser
775 

CCC GAG GGG CTT 

Pro Glu Gly Leu 
790 

CCA CCC CCT ATA 

Pro Pro Pro Ile
805 

CCA GCG AAG GAG 

Pro Ala Lys Glu 
820 

CCG CCG CCC CCA 
Pro Pro Pro Pro 

CCT CCA CCC CAG 

Pro Pro Pro Gln
855 

CCA GCC CTG GAA 

Pro Ala Leu Glu 
870 

GAA GAG GAG GAG 

Glu Glu Glu Glu 
885 

GAA GAG GAA GAA 

Glu Glu Glu Glu 
900 

GAG GAA GAG GAA 
Glu Glu Glu Glu 

GAG GAA GAA GAG 

Glu Glu Glu Glu 
935 



PCT/US96/19944

TTG GAA AGT GAC 
Leu Glu Ser Asp 

CCC CCC CTG CCA 

Pro Pro Leu Pro 
795 

GCC CCC ACT GGG 

Ala Pro Thr Gly 
810 

GAG CCT GAA GAA 

Glu Pro Glu Glu 
825 

CCT CCG CCG CCG 

Pro Pro Pro Pro 
840 

TTG GTC CCT GAA 
Leu Val Pro Glu 

GAG GAT TTG ACA 

Glu Asp Leu Thr 
875 

GAA GAA GGA GAA 

Glu Glu Gly Glu 
890 

GAA GAA GAG GAA 

Glu Glu Glu Glu 
905 

GAG GAT GAA GAG 

Glu Asp Glu Glu 
920 

TTT GAG GAA GAA 
Phe Glu Glu Glu 

TTT GAG GAA GAA GAA GGT GAG TTA GAG GAA GAA GAA GAA GAG GAG GAT 
3303 

Phe Glu Glu Glu Glu Gly Glu Leu Glu Glu Glu Glu Glu Glu Glu Asp 
940 945 950 955 

GAG GAG GAG GAA GAA GAA CTG GAA GAG GTG GAA GAC CTG GAG TTT GGC 
3351 

Glu Glu Glu Glu Glu Glu Leu Glu Glu Val Glu Asp Leu Glu Phe Gly 
960 965 970 

ACA GCA GGA GGG GAG GTA GAA GAA GGT GCA CCA CCA CCC CCA ACC CTG 
3399 

Thr Ala Gly Gly Glu Val Glu Glu Gly Ala Pro Pro Pro Pro Thr Leu 
975 980 985 

CCT CCA GCT CTG CCT CCC CCT GAG TCT CCC CCA AAG GTG CAG CCA GAA 
3447 

Pro Pro Ala Leu Pro Pro Pro Glu Ser Pro Pro Lys Val Gln Pro Glu
990 995 1000 

CCC GAA CCC GAA CCC GGG CTG CTT TTG GAA GTG GAG GAG CCA GGG ACG 
3495 

Pro Glu Pro Glu Pro Gly Leu Leu Leu Glu Val Glu Glu Pro Gly Thr 
1005 1010 1015 

GAG GAG GAG CGT GGG GCT GAC ACA GCT CCC ACC CTG GCC CCT GAA GCG 
3543 

Glu Glu Glu Arg Gly Ala Asp Thr Ala Pro Thr Leu Ala Pro Glu Ala 

1020 1025 1030 1035 

CTC CCC TCC CAG GGA GAG GTG GAG AGG GAA GGG GAA AGC CCT GCG GCA 
3591 

Leu Pro Ser Gln Gly Glu Val Glu Arg Glu Gly Glu Ser Pro Ala Ala
1040 1045 1050 

GGG CCC CCT CCC CAG GAG CTT GTT GAA GAA GAG CCC TCT CCT CCC CCA 
3639 

Gly Pro Pro Pro Gln Glu Leu Val Glu Glu Glu Pro Ser Pro Pro Pro
1055 1060 1065 

ACC CTG TTG GAA GAG GAG ACT GAG GAT GGG AGT GAC AAG GTG CAG CCC 
3687 

Thr Leu Leu Glu Glu Glu Thr Glu Asp Gly Ser Asp Lys Val Gln Pro
1070 1075 1080 

CCA CCA GAG ACA CCT GCA GAA GAA GAG ATG GAG ACA GAG ACA GAG GCC 
3735 

Pro Pro Glu Thr Pro Ala Glu Glu Glu Met Glu Thr Glu Thr Glu Ala 
1085 1090 1095 

GAA GCT CTC CAG GAA AAG GAG CAG GAT GAC ACA GCT GCC ATG CTG GCC 
3783 

Glu Ala Leu Gln Glu Lys Glu Gln Asp Asp Thr Ala Ala Met Leu Ala
1100 1105 1110 1115

GAC TTC ATC GAT TGT CCC CCT GAT GAT GAG AAG CCA CCA CCT CCC ACA 
3831 

Asp Phe Ile Asp Cys Pro Pro Asp Asp Glu Lys Pro Pro Pro Pro Thr
1120 1125 1130 

GAG CCT GAC TCC TAG C CATCTTCTGC ACCCCACCTC TTTGTTTCCA ATAAAGTTAT 
3887 

Glu Pro Asp Ser * 
1135 

GTCCTTAAAA AAAA 
3901 



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1135 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Met Glu Leu Ala Val Ala Val Leu Arg Asp Leu Leu Arg Tyr Ala Ala
1 5 10 15

Gln Leu Pro Ala Leu Phe Arg Asp Ile Ser Met Asn His Leu Pro Gly
20 25 30

Leu Leu Thr Ser Leu Leu Gly Leu Arg Pro Glu Cys Glu Gln Ser Ala
35 40 45

Leu Glu Gly Met Lys Ala Cys Met Thr Tyr Phe Pro Arg Ala Cys Gly
50 55 60

Ser Leu Lys Gly Lys Leu Ala Ser Phe Phe Leu Ser Arg Val Asp Ala
65 70 75 80

Leu Ser Pro Gln Leu Gln Gln Leu Ala Cys Glu Cys Tyr Ser Arg Leu
85 90 95

Pro Ser Leu Gly Ala Gly Phe Ser Gln Gly Leu Lys His Thr Glu Ser
100 105 110

Trp Glu Gln Glu Leu His Ser Leu Leu Ala Ser Leu His Thr Leu Leu
115 120 125

Gly Ala Leu Tyr Glu Gly Ala Glu Thr Ala Pro Val Gln Asn Glu Gly
130 135 140



Pro Gly Val Glu Met Leu Leu Ser Ser Glu Asp Gly Asp Ala His Val 
145 150 155 160 
Leu Leu Gln Leu Arg Gln Arg Phe Ser Gly Leu Ala Arg Cys Leu Gly
165 170 175

Leu Met Leu Ser Ser Glu Phe Gly Ala Pro Val Ser Val Pro Val Gln
180 185 190

Glu Ile Leu Asp Phe Ile Cys Arg Thr Leu Ser Val Ser Ser Lys Asn
195 200 205

Ile Val Ser Gly Ile Cys His Leu Phe Arg Ala Leu Ala Gln Asp Thr
210 215 220

Arg Gln Pro Gly Lys Tyr Trp Gly Pro Glu Ser Pro Gln Thr Val Ser
225 230 235 240

Ser Trp Ser Pro Ser Gln Arg Ala Ser Thr Phe Val Gln Ile Thr Ser
245 250 255

Leu Pro Met Cys Arg Asp Thr Gly Ala Gln Cys Gln Ser Val Ala Asn
260 265 270

Ala Ser Leu Gly Glu Gly Glu Phe Gly Asp Ser Ala Glu Ser Leu Leu
275 280 285

Arg Gly Pro Ala Ile Leu Leu Thr Phe His Pro Gly Ser Ile Leu Glu
290 295 300

Asp Arg Gly Leu Ile Leu Leu Gly Glu Met Arg Ser Gly Val Gly Phe
305 310 315 320

Leu Thr Tyr Val Tyr Ile Cys Lys Trp Ser Phe Pro Val Ser Val Ser
325 330 335

Leu Trp Leu Ser Leu Ser Ser Ser Thr Leu Tyr Leu Cys Pro Phe Phe
340 345 350

Leu Gln Ser Leu His Gly Asp Gly Pro Cys Gly Cys Cys Cys Cys Pro
355 360 365

Leu Ser Thr Leu Lys Ala Leu Asp Leu Leu Ser Ala Leu Ile Leu Ala
370 375 380

Cys Gly Ser Arg Leu Leu Arg Phe Gly Ile Leu Ile Gly Arg Leu Leu
385 390 395 400

Pro Gln Val Leu Asn Ser Trp Ser Ile Gly Arg Asp Ser Leu Ser Pro
405 410 415

Gly Gln Glu Arg Pro Tyr Ser Thr Val Arg Thr Lys Val Tyr Ala Ile
420 425 430

Leu Glu Leu Trp Val Gln Val Cys Gly Ala Ser Ala Gly Met Leu Gln
435 440 445

Gly Gly Ala Ser Gly Glu Ala Leu Leu Thr His Leu Leu Ser Asp Ile
450 455 460



Ser Pro Pro Ala Asp Ala Leu Lys Leu Arg Ser Pro Arg Gly Ser Pro
465 470 475 480

Asp Gly Ser Leu Gln Thr Gly Lys Pro Ser Ala Pro Lys Lys Leu Lys
485 490 495

Leu Asp Val Gly Glu Ala Met Ala Pro Pro Ser His Arg Lys Gly Asp
500 505 510

Ser Asn Ala Asn Ser Asp Val Cys Pro Ala Ala Leu Arg Gly Leu Ser
515 520 525

Arg Thr Ile Leu Met Cys Gly Pro Leu Ile Lys Glu Glu Thr His Arg
530 535 540

Arg Leu His Asp Leu Val Leu Pro Leu Val Met Gly Val Gln Gln Gly
545 550 555 560

Glu Val Leu Gly Ser Ser Pro Tyr Thr Ser Ser Pro Ala Ala Val Asn
565 570 575

Ser Thr Ala Cys Cys Trp Arg Cys Cys Trp Pro Arg Leu Leu Ala Ala
580 585 590

His Leu Leu Leu Pro Val Pro Cys Lys Pro Ser Pro Ser Ala Ser Glu
595 600 605

Lys Ile Ala Leu Arg Ser Pro Leu Ser Cys Ser Glu Ala Leu Val Thr
610 615 620

Cys Ala Ala Leu Thr His Pro Arg Val Pro Pro Leu Gln Pro Met Gly
625 630 635 640

Pro Thr Cys Pro Thr Pro Ala Pro Val Pro Leu Leu Arg Pro His Arg
645 650 655

Pro Ser Gly Pro His Arg Ser Ile Leu Arg Ala Pro Cys Pro Gln Trp
660 665 670

Ala Pro Cys Pro Gln Gln Ala Pro Cys Pro Ser Ala Gly Pro Met Pro
675 680 685

Ser Ala Gly Pro Val Pro Ser Glu Pro Trp Thr Ser Thr Thr Ala Asn
690 695 700

Leu Leu Gly Leu Leu Ser Arg Pro Ser Val Cys Pro Pro Arg Leu Leu
705 710 715 720

Pro Gly Pro Glu Asn His Arg Ala Gly Ser Asn Glu Asp Pro Ile Leu
725 730 735

Ala Pro Ser Gly Thr Pro Pro Pro Thr Ile Pro Pro Asp Glu Thr Phe
740 745 750
Gly Gly Arg Val Pro Arg Pro Ala Phe Val His Tyr Asp Lys Glu Glu
755 760 765

Ala Ser Asp Val Glu Ile Ser Leu Glu Ser Asp Ser Asp Asp Ser Val
770 775 780

Val Ile Val Pro Glu Gly Leu Pro Pro Leu Pro Pro Pro Pro Pro Ser
785 790 795 800

Gly Ala Thr Pro Pro Pro Ile Ala Pro Thr Gly Pro Pro Thr Ala Ser
805 810 815

Pro Pro Val Pro Ala Lys Glu Glu Pro Glu Glu Leu Pro Ala Ala Pro
820 825 830

Gly Pro Leu Pro Pro Pro Pro Pro Pro Pro Pro Pro Val Pro Gly Pro
835 840 845

Val Thr Leu Pro Pro Pro Gln Leu Val Pro Glu Gly Thr Pro Gly Gly
850 855 860

Gly Gly Pro Pro Ala Leu Glu Glu Asp Leu Thr Val Ile Asn Ile Asn
865 870 875 880

Ser Ser Asp Glu Glu Glu Glu Glu Glu Gly Glu Glu Glu Glu Glu Glu
885 890 895

Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu
900 905 910

Glu Asp Phe Glu Glu Glu Glu Glu Asp Glu Glu Glu Tyr Phe Glu Glu
915 920 925

Glu Glu Glu Glu Glu Glu Glu Phe Glu Glu Glu Phe Glu Glu Glu Glu
930 935 940

Gly Glu Leu Glu Glu Glu Glu Glu Glu Glu Asp Glu Glu Glu Glu Glu
945 950 955 960

Glu Leu Glu Glu Val Glu Asp Leu Glu Phe Gly Thr Ala Gly Gly Glu
965 970 975

Val Glu Glu Gly Ala Pro Pro Pro Pro Thr Leu Pro Pro Ala Leu Pro
980 985 990

Pro Pro Glu Ser Pro Pro Lys Val Gln Pro Glu Pro Glu Pro Glu Pro
995 1000 1005

Gly Leu Leu Leu Glu Val Glu Glu Pro Gly Thr Glu Glu Glu Arg Gly
1010 1015 1020

Ala Asp Thr Ala Pro Thr Leu Ala Pro Glu Ala Leu Pro Ser Gln Gly
1025 1030 1035 1040
Glu Val Glu Arg Glu Gly Glu Ser Pro Ala Ala Gly Pro Pro Pro Gln
1045 1050 1055

Glu Leu Val Glu Glu Glu Pro Ser Pro Pro Pro Thr Leu Leu Glu Glu
1060 1065 1070

Glu Thr Glu Asp Gly Ser Asp Lys Val Gln Pro Pro Pro Glu Thr Pro
1075 1080 1085

Ala Glu Glu Glu Met Glu Thr Glu Thr Glu Ala Glu Ala Leu Gln Glu
1090 1095 1100

Lys Glu Gln Asp Asp Thr Ala Ala Met Leu Ala Asp Phe Ile Asp Cys
1105 1110 1115 1120

Pro Pro Asp Asp Glu Lys Pro Pro Pro Pro Thr Glu Pro Asp Ser
1125 1130 1135



(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3211 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 439..3157

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

GGGGCAGCCG TTCTGAGTGG GCCCTCTGCG GGCTCCGCGG CTGGGGTTCC TGGCGGGACC 
60 

GGGGGTCTCT CGGCAGTGAG CTCGGGCCCG CGGCTCCGCC TGCTGCTGCT GGAGAGTGTT 
120 



TCTGGTTTGC TGCAACCTCG AACGGGGTCT GCCGTTGCTC CGGTGCATCC CCCAAACCGC 
180

TCGGCCCCAC ATTTGCCCGG GCTCATGTGC CTATTGCGGC TGCATGGGTC GGTGGGCGGG 
240 

GCCCAGAACC TTTCAGCTCT TGGGGCATTG GTGAGTCTCA GTAATGCACG TCTCAGTTCC
300 

ATCAAAACTC GGTTTGAGGG CCTGTGTCTG CTGTCCCTGC TGGTAGGGGA GAGCCCCACA 
360 


GAGCTATTCC AGCAGCACTG TGTGTCTTGG CTTCGGAGCA TTCAGCAGGT GTTACAGACC 
420 

CAGGACCCGC CTGCCACA ATG GAG CTG GCC GTG GCT GTC CTG AGG GAC CTC 
471 

Met Glu Leu Ala Val Ala Val Leu Arg Asp Leu 
1 5 10

CTC CGA TAT GCA GCC CAG CTG CCT GCA CTG TTC CGG GAC ATC TCC ATG 
519 

Leu Arg Tyr Ala Ala Gln Leu Pro Ala Leu Phe Arg Asp Ile Ser Met

15 20 25 

AAC CAC CTC CCT GGC CTT CTC ACC TCC CTG CTG GGC CTC AGG CCA GAG 
567 

Asn His Leu Pro Gly Leu Leu Thr Ser Leu Leu Gly Leu Arg Pro Glu 
30 35 40 

TGT GAG CAG TCA GCA TTG GAA GGA ATG AAG GCT TGT ATG ACC TAT TTC 
615 

Cys Glu Gln Ser Ala Leu Glu Gly Met Lys Ala Cys Met Thr Tyr Phe
45 50 55 

CCT CGG GCT TGT GGT TCT CTC AAA GGC AAG CTG GCC TCA TTT TTT CTG 
663 

Pro Arg Ala Cys Gly Ser Leu Lys Gly Lys Leu Ala Ser Phe Phe Leu 
60 65 70 75 

TCT AGG GTG GAT GCC TTG AGC CCT CAG CTC CAA CAG TTG GCC TGT GAG 
711 

Ser Arg Val Asp Ala Leu Ser Pro Gln Leu Gln Gln Leu Ala Cys Glu
80 85 90 

TGT TAT TCC CGG CTG CCC TCT TTA GGG GCT GGC TTT TCC CAA GGC CTG 
759 

Cys Tyr Ser Arg Leu Pro Ser Leu Gly Ala Gly Phe Ser Gln Gly Leu
95 100 105 

AAG CAC ACC GAG AGC TGG GAG CAG GAG CTA CAC AGT CTG CTG GCC TCA 
807 

Lys His Thr Glu Ser Trp Glu Gln Glu Leu His Ser Leu Leu Ala Ser
110 115 120 

CTG CAC ACC CTG CTG GGG GCC CTG TAC GAG GGA GCA GAG ACT GCT CCT 
855 

Leu His Thr Leu Leu Gly Ala Leu Tyr Glu Gly Ala Glu Thr Ala Pro 
125 130 135 

GTG CAG AAT GAA GGC CCT GGG GTG GAG ATG CTG CTG TCC TCA GAA GAT 
903 

Val Gln Asn Glu Gly Pro Gly Val Glu Met Leu Leu Ser Ser Glu Asp
140 145 150 155 

GGT GAT GCC CAT GTC CTT CTC CAG CTT CGG CAG AGG TTT TCG GGA CTG 
951 
Gly Asp Ala His



GCC CGC TGC CTA 
999 

Ala Arg Cys Leu 
175 

TCC GTC CCT GTG 
1047 

Ser Val Pro Val 
190 

GTC AGT AGC AAG 
1095 

Val Ser Ser Lys 
205 

TGC TGC TGC CCT 
1143 

Cys Cys Cys Pro 
220 

CTC ATC CTC GCG 
1191 

Leu Ile Leu Ala



GGC CGC CTG CTT 
1239 

Gly Arg Leu Leu 
255 

TCC CTC TCT CCA 
1287 

Ser Leu Ser Pro 
270 

GTG TAT GCG ATA 
1335 

Val Tyr Ala Ile
285 

GGA ATG CTT CAG 
1383 

Gly Met Leu Gln
300 

CTC AGC GAC ATC 
1431 

Leu Ser Asp Ile



CGG GGG AGC CCT 
1479 



Val Leu Leu Gln
160 

GGG CTC ATG CTC 
Gly Leu Met Leu 

CAG GAA ATC CTG 

Gln Glu Ile Leu
195 

AAT ATT AGC TTG 

Asn Ile Ser Leu
210 

CTA TCC ACC TTG 

Leu Ser Thr Leu 
225 

TGT GGA AGC CGG 

Cys Gly Ser Arg 
240 

CCC CAG GTC CTC 
Pro Gln Val Leu

GGC CAG GAG AGG 

Gly Gln Glu Arg
275 

TTA GAG CTG TGG 

Leu Glu Leu Trp 
290 

GGA GGA GCC TCT 

Gly Gly Ala Ser 
305 

TCC CCG CCA GCT 

Ser Pro Pro Ala 
320 

GAT GGG AGT TTG 




Leu Arg Gln Arg
165 

AGC TCT GAG TTT 

Ser Ser Glu Phe 
180 

GAT TTC ATC TGC 
Asp Phe Ile Cys

CAT GGA GAT GGT 

His Gly Asp Gly 
215 

AAG GCC TTG GAC 

Lys Ala Leu Asp 
230 

CTC TTG CGC TTT 

Leu Leu Arg Phe 
245 

AAT TCC TGG AGC 

Asn Ser Trp Ser 
260 

CCT TAC AGC ACG 
Pro Tyr Ser Thr 

GTG CAG GTT TGT 

Val Gln Val Cys
295 

GGA GAG GCC CTG 

Gly Glu Ala Leu 
310 

GAT GCC CTT AAG 

Asp Ala Leu Lys 
325 

CAG ACT GGG AAG 



Phe Ser Gly Leu 

170 

GGA GCT CCC GTG 

Gly Ala Pro Val 
185 

CGG ACC CTC AGC 

Arg Thr Leu Ser 
200 

CCC TGC GGC TGC 
Pro Cys Gly Cys 

CTG CTG TCT GCA 

Leu Leu Ser Ala 
235 

GGG ATC CTG ATC 

Gly Ile Leu Ile
250 

ATC GGT AGA GAT 

Ile Gly Arg Asp
265 

GTT CGG ACC AAG 

Val Arg Thr Lys 
280 

GGG GCC TCG GCG 
Gly Ala Ser Ala 

CTC ACC CAC CTG 

Leu Thr His Leu 
315 

CTG CGT AGC CCG 

Leu Arg Ser Pro 
330 

CCT AGC GCC CCC 
Arg Gly Ser Pro
335 

AAG AAG CTA AAG 
1527 

Lys Lys Leu Lys 
350 

CTC CTC TTG CCT 
1575 

Leu Leu Leu Pro 
365 

ATA GCC TTG AGG 
1623 

Ile Ala Leu Arg
380 

GCT GCT CTG ACC 
1671 

Ala Ala Leu Thr 



ACC TGC CCC ACA 
1719 

Thr Cys Pro Thr 
415 

TCA GGG CCC CAC 
1767 

Ser Gly Pro His 
430 

CCA TGC CCT CAG 
1815 

Pro Cys Pro Gln
445 

GCA GGC CCT GTG 
1863 

Ala Gly Pro Val 
460 

CTA GGC CTT CTG 
1911 

Leu Gly Leu Leu 



GGC CCT GAG AAC 
1959 

Gly Pro Glu Asn 
495 

CCT AGT GGG ACT 
2007 



Asp Gly Ser Leu 

CTG GAT GTG GGG 

Leu Asp Val Gly 
355 

GTG CCC TGC AAG 

Val Pro Cys Lys 
370 

TCT CCT CTT TCT 

Ser Pro Leu Ser 
385 

CAC CCC CGG GTT 

His Pro Arg Val 
400 

CCT GCT CCA GTC 
Pro Ala Pro Val 

CGT TCC ATC CTC 

Arg Ser Ile Leu
435 

CAG GCC CCA TGC 

Gln Ala Pro Cys
450 

CCC TCG GAG CCC 

Pro Ser Glu Pro 
465 

TCC AGG CCT AGT 

Ser Arg Pro Ser 
480 

CAC CGG GCA GGC 
His Arg Ala Gly 

CCC CCA CCT ACT 




Gln Thr Gly Lys 
340 

GAA GCT ATG GCC 
Glu Ala Met Ala 

CCT TCT CCC TCG 

Pro Ser Pro Ser 
375 

TGC TCA GAA GCA 

Cys Ser Glu Ala 
390 

CCT CCC CTG CAG 

Pro Pro Leu Gln
405 

CCC CTC CTG AGG 

Pro Leu Leu Arg 
420 

CGG GCC CCA TGC 
Arg Ala Pro Cys 

CCT TCA GCA GGC 

Pro Ser Ala Gly 
455 

TGG ACC TCC ACC 

Trp Thr Ser Thr 
470 

GTC TGT CCT CCC 

Val Cys Pro Pro 
485 

TCA AAT GAG GAC 

Ser Asn Glu Asp 
500 

ATA CCC CCA GAT 



Pro Ser Ala Pro 
345 

CCG CCA AGC CAC 

Pro Pro Ser His 
360 

GCC AGC GAG AAG 
Ala Ser Glu Lys 

CTG GTG ACC TGT 

Leu Val Thr Cys 
395 

CCC ATG GGC CCC 

Pro Met Gly Pro 
410 

CCC CAT CGC CCT 

Pro His Arg Pro 
425 

CCT CAG TGG GCT 

Pro Gln Trp Ala
440 

CCC ATG CCC TCA 
Pro Met Pro Ser 

ACA GCC AAC CTC 

Thr Ala Asn Leu 
475 

CGG CTT CTT CCT 

Arg Leu Leu Pro 
490 

CCC ATC CTT GCC 

Pro Ile Leu Ala
505 

GAA ACT TTT GGG 
Pro Ser Gly Thr
510 

GGG AGA GTG CCC 
2055 

Gly Arg Val Pro 

525 

TCT GAT GTG GAG 
2103 

Ser Asp Val Glu 
540 

ATC GTG CCC GAG 
2151 

Ile Val Pro Glu



GCC ACA CCA CCC 
2199 

Ala Thr Pro Pro 
575 

CCT GTG CCA GCG 
2247 

Pro Val Pro Ala 
590 

CCT CTC CCG CCG 
2295 

Pro Leu Pro Pro 
605 

ACC CTC CCT CCA 
2343 

Thr Leu Pro Pro 
620 

GGA CCC CCA GCC 
2391 

Gly Pro Pro Ala 



AGT GAT GAA GAG 
2439 

Ser Asp Glu Glu 
655 

GAA GAA GAA GAG 
2487 

Glu Glu Glu Glu 
670 

GAC TTT GAG GAA 
2535 



Pro Pro Pro Thr 
515 

AGA CCA GCC TTT 

Arg Pro Ala Phe 
530 

ATC TCC TTG GAA 

Ile Ser Leu Glu
545

GGG CTT CCC CCC 

Gly Leu Pro Pro 
560 

CCT ATA GCC CCC 
Pro Ile Ala Pro

AAG GAG GAG CCT 

Lys Glu Glu Pro 
595 

CCC CCA CCT CCG 

Pro Pro Pro Pro 
610 

CCC CAG TTG GTC 

Pro Gln Leu Val
625 

CTG GAA GAG GAT 

Leu Glu Glu Asp 
640 

GAG GAG GAA GAA 
Glu Glu Glu Glu 

GAA GAA GAA GAA 

Glu Glu Glu Glu 
675 

GAG GAA GAG GAT 




Ile Pro Pro Asp 

GTC CAC TAT GAC 

Val His Tyr Asp 
535 

AGT GAC TCT GAT 

Ser Asp Ser Asp 
550 

CTG CCA CCC CCA 

Leu Pro Pro Pro 
565 

ACT GGG CCA CCA 

Thr Gly Pro Pro 
580 

GAA GAA CTT CCT 

Glu Glu Leu Pro 

CCG CCG CCT GTT 

Pro Pro Pro Val 
615 

CCT GAA GGG ACT 

Pro Glu Gly Thr 
630 

TTG ACA GTT ATT 

Leu Thr Val Ile
645 

GGA GAA GAG GAA 

Gly Glu Glu Glu 
660 

GAG GAA GAA GAG 
Glu Glu Glu Glu 

GAA GAG GAA TAT 



Glu Thr Phe Gly 
520 

AAG GAG GAG GCA 
Lys Glu Glu Ala 

GAC AGC GTG GTG 

Asp Ser Val Val 
555 

CCA CCC TCA GGT 

Pro Pro Ser Gly 
570 

ACA GCC TCC CCT 

Thr Ala Ser Pro 
585 

GCG GCC CCA GGG 

Ala Ala Pro Gly 
600 

CCT GGT CCT GTG 
Pro Gly Pro Val 

CCT GGT GGG GGA 

Pro Gly Gly Gly 
635 

AAT ATC AAC AGC 

Asn Ile Asn Ser
650 

GAA GAA GAA GAA 

Glu Glu Glu Glu 
665 

GAA GAG GAG GAA 

Glu Glu Glu Glu 
680 

TTT GAA GAG GAA 
Asp Phe Glu Glu
685 

GAA GAG GAG GAA 
2583 

Glu Glu Glu Glu 
700 

GAG TTA GAG GAA 
2631 

Glu Leu Glu Glu 



CTG GAA GAG GTG 
2679 

Leu Glu Glu Val 
735 

GAA GAA GGT GCA 
2727 

Glu Glu Gly Ala 
750 

CCT GAG TCT CCC 
2775 

Pro Glu Ser Pro 
765 

CTG CTT TTG GAA 
2823 

Leu Leu Leu Glu 
780 

GAC ACA GCT CCC 
2871 

Asp Thr Ala Pro 



GTG GAG AGG GAA 
2919 

Val Glu Arg Glu 
815 

CTT GTT GAA GAA 
2967 

Leu Val Glu Glu 
830 

ACT GAG GAT GGG 
3015 

Thr Glu Asp Gly 
845 

GAA GAA GAG ATG 
3063 



Glu Glu Glu Asp 
690 

GAA GAG TTT GAG 

Glu Glu Phe Glu 
705 

GAA GAA GAA GAG 

Glu Glu Glu Glu 
720 

GAA GAC CTG GAG 
Glu Asp Leu Glu 

CCA CCA CCC CCA 

Pro Pro Pro Pro 
755 

CCA AAG GTG CAG 

Pro Lys Val Gln
770 

GTG GAG GAG CCA 

Val Glu Glu Pro 
785 

ACC CTG GCC CCT 

Thr Leu Ala Pro 
800 

GGG GAA AGC CCT 
Gly Glu Ser Pro 

GAG CCC TCT CCT 

Glu Pro Ser Pro 
835 

AGT GAC AAG GTG 

Ser Asp Lys Val 
850

GAG ACA GAG ACA 




Glu Glu Glu Tyr 
695 

GAA GAA TTT GAG 

Glu Glu Phe Glu 
710 

GAG GAT GAG GAG 

Glu Asp Glu Glu 
725 

TTT GGC ACA GCA 

Phe Gly Thr Ala 
740 

ACC CTG CCT CCA 
Thr Leu Pro Pro 

CCA GAA CCC GAA 

Pro Glu Pro Glu 
775 

GGG ACG GAG GAG 

Gly Thr Glu Glu 
790 

GAA GCG CTC CCC 

Glu Ala Leu Pro 
805 

GCG GCA GGG CCC 

Ala Ala Gly Pro 
820 

CCC CCA ACC CTG 
Pro Pro Thr Leu 

CAG CCC CCA CCA 

Gln Pro Pro Pro
855 

GAG GCC GAA GCT 



Phe Glu Glu Glu 

GAA GAA GAA GGT 

Glu Glu Glu Gly 
715 

GAG GAA GAA GAA 

Glu Glu Glu Glu 
730 

GGA GGG GAG GTA 

Gly Gly Glu Val 
745 

GCT CTG CCT CCC 

Ala Leu Pro Pro 
760 

CCC GAA CCC GGG 
Pro Glu Pro Gly 

GAG CGT GGG GCT 

Glu Arg Gly Ala 
795 

TCC CAG GGA GAG 

Ser Gln Gly Glu
810 

CCT CCC CAG GAG 

Pro Pro Gln Glu
825 

TTG GAA GAG GAG 

Leu Glu Glu Glu 
840 

GAG ACA CCT GCA 
Glu Thr Pro Ala 

CTC CAG GAA AAG 
Glu Glu Glu Met Glu Thr Glu Thr Glu Ala Glu Ala Leu Gln Glu Lys
860 865 870 875 

GAG CAG GAT GAC ACA GCT GCC ATG CTG GCC GAC TTC ATC GAT TGT CCC 
3111 

Glu Gln Asp Asp Thr Ala Ala Met Leu Ala Asp Phe Ile Asp Cys Pro
880 885 890 

CCT GAT GAT GAG AAG CCA CCA CCT CCC ACA GAG CCT GAC TCC TAG C 
3157 

Pro Asp Asp Glu Lys Pro Pro Pro Pro Thr Glu Pro Asp Ser * 
895 900 905 

CATCTTCTGC ACCCCACCTC TTTGTTTCCA ATAAAGTTAT GTCCTTAAAA AAAA 
3211 



(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 905 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Met Glu Leu Ala Val Ala Val Leu Arg Asp Leu Leu Arg Tyr Ala Ala
1 5 10 15

Gln Leu Pro Ala Leu Phe Arg Asp Ile Ser Met Asn His Leu Pro Gly
20 25 30

Leu Leu Thr Ser Leu Leu Gly Leu Arg Pro Glu Cys Glu Gln Ser Ala
35 40 45

Leu Glu Gly Met Lys Ala Cys Met Thr Tyr Phe Pro Arg Ala Cys Gly
50 55 60

Ser Leu Lys Gly Lys Leu Ala Ser Phe Phe Leu Ser Arg Val Asp Ala
65 70 75 80

Leu Ser Pro Gln Leu Gln Gln Leu Ala Cys Glu Cys Tyr Ser Arg Leu
85 90 95

Pro Ser Leu Gly Ala Gly Phe Ser Gln Gly Leu Lys His Thr Glu Ser
100 105 110

Trp Glu Gln Glu Leu His Ser Leu Leu Ala Ser Leu His Thr Leu Leu
115 120 125

Gly Ala Leu Tyr Glu Gly Ala Glu Thr Ala Pro Val Gln Asn Glu Gly
130 135 140
Pro Gly Val Glu Met Leu Leu Ser Ser Glu Asp Gly Asp Ala His Val
145 150 155 160

Leu Leu Gln Leu Arg Gln Arg Phe Ser Gly Leu Ala Arg Cys Leu Gly
165 170 175

Leu Met Leu Ser Ser Glu Phe Gly Ala Pro Val Ser Val Pro Val Gln
180 185 190

Glu Ile Leu Asp Phe Ile Cys Arg Thr Leu Ser Val Ser Ser Lys Asn
195 200 205

Ile Ser Leu His Gly Asp Gly Pro Cys Gly Cys Cys Cys Cys Pro Leu
210 215 220

Ser Thr Leu Lys Ala Leu Asp Leu Leu Ser Ala Leu Ile Leu Ala Cys
225 230 235 240

Gly Ser Arg Leu Leu Arg Phe Gly Ile Leu Ile Gly Arg Leu Leu Pro
245 250 255

Gln Val Leu Asn Ser Trp Ser Ile Gly Arg Asp Ser Leu Ser Pro Gly
260 265 270

Gln Glu Arg Pro Tyr Ser Thr Val Arg Thr Lys Val Tyr Ala Ile Leu
275 280 285

Glu Leu Trp Val Gln Val Cys Gly Ala Ser Ala Gly Met Leu Gln Gly
290 295 300

Gly Ala Ser Gly Glu Ala Leu Leu Thr His Leu Leu Ser Asp Ile Ser
305 310 315 320

Pro Pro Ala Asp Ala Leu Lys Leu Arg Ser Pro Arg Gly Ser Pro Asp
325 330 335

Gly Ser Leu Gln Thr Gly Lys Pro Ser Ala Pro Lys Lys Leu Lys Leu
340 345 350

Asp Val Gly Glu Ala Met Ala Pro Pro Ser His Leu Leu Leu Pro Val
355 360 365

Pro Cys Lys Pro Ser Pro Ser Ala Ser Glu Lys Ile Ala Leu Arg Ser
370 375 380

Pro Leu Ser Cys Ser Glu Ala Leu Val Thr Cys Ala Ala Leu Thr His
385 390 395 400

Pro Arg Val Pro Pro Leu Gln Pro Met Gly Pro Thr Cys Pro Thr Pro
405 410 415

Ala Pro Val Pro Leu Leu Arg Pro His Arg Pro Ser Gly Pro His Arg
420 425 430

Ser Ile Leu Arg Ala Pro Cys Pro Gln Trp Ala Pro Cys Pro Gln Gln
435 440 445

Ala Pro Cys Pro Ser Ala Gly Pro Met Pro Ser Ala Gly Pro Val Pro 
450 455 460 

Ser Glu Pro Trp Thr Ser Thr Thr Ala Asn Leu Leu Gly Leu Leu Ser 
465 470 475 480 

Arg Pro Ser Val Cys Pro Pro Arg Leu Leu Pro Gly Pro Glu Asn His 
485 490 495 

Arg Ala Gly Ser Asn Glu Asp Pro Ile Leu Ala Pro Ser Gly Thr Pro
500 505 510

Pro Pro Thr Ile Pro Pro Asp Glu Thr Phe Gly Gly Arg Val Pro Arg
515 520 525

Pro Ala Phe Val His Tyr Asp Lys Glu Glu Ala Ser Asp Val Glu Ile
530 535 540

Ser Leu Glu Ser Asp Ser Asp Asp Ser Val Val Ile Val Pro Glu Gly
545 550 555 560

Leu Pro Pro Leu Pro Pro Pro Pro Pro Ser Gly Ala Thr Pro Pro Pro
565 570 575

Ile Ala Pro Thr Gly Pro Pro Thr Ala Ser Pro Pro Val Pro Ala Lys
580 585 590

Glu Glu Pro Glu Glu Leu Pro Ala Ala Pro Gly Pro Leu Pro Pro Pro
595 600 605

Pro Pro Pro Pro Pro Pro Val Pro Gly Pro Val Thr Leu Pro Pro Pro
610 615 620

Gln Leu Val Pro Glu Gly Thr Pro Gly Gly Gly Gly Pro Pro Ala Leu
625 630 635 640

Glu Glu Asp Leu Thr Val Ile Asn Ile Asn Ser Ser Asp Glu Glu Glu
645 650 655 

Glu Glu Glu Gly Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu 
660 665 670 

Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Asp Phe Glu Glu Glu 
675 680 685 

Glu Glu Asp Glu Glu Glu Tyr Phe Glu Glu Glu Glu Glu Glu Glu Glu 
690 695 700 

Glu Phe Glu Glu Glu Phe Glu Glu Glu Glu Gly Glu Leu Glu Glu Glu 
705 710 715 720 



Glu Glu Glu Glu Asp Glu Glu Glu Glu Glu Glu Leu Glu Glu Val Glu 
725 730 735 
Asp Leu Glu Phe Gly Thr Ala Gly Gly Glu Val Glu Glu Gly Ala Pro
740 745 750

Pro Pro Pro Thr Leu Pro Pro Ala Leu Pro Pro Pro Glu Ser Pro Pro
755 760 765

Lys Val Gln Pro Glu Pro Glu Pro Glu Pro Gly Leu Leu Leu Glu Val
770 775 780

Glu Glu Pro Gly Thr Glu Glu Glu Arg Gly Ala Asp Thr Ala Pro Thr
785 790 795 800

Leu Ala Pro Glu Ala Leu Pro Ser Gln Gly Glu Val Glu Arg Glu Gly
805 810 815

Glu Ser Pro Ala Ala Gly Pro Pro Pro Gln Glu Leu Val Glu Glu Glu
820 825 830

Pro Ser Pro Pro Pro Thr Leu Leu Glu Glu Glu Thr Glu Asp Gly Ser
835 840 845

Asp Lys Val Gln Pro Pro Pro Glu Thr Pro Ala Glu Glu Glu Met Glu
850 855 860

Thr Glu Thr Glu Ala Glu Ala Leu Gln Glu Lys Glu Gln Asp Asp Thr
865 870 875 880

Ala Ala Met Leu Ala Asp Phe Ile Asp Cys Pro Pro Asp Asp Glu Lys
885 890 895

Pro Pro Pro Pro Thr Glu Pro Asp Ser
900 905

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Trp Leu Arg Lys
1

(2) INFORMATION FOR SEQ ID NO: 11: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Ile Tyr Ile Lys Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Leu Thr Pro Val Ser Pro Glu Ser Ser Ser Thr Glu Glu Lys 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Asn Val Gly Glu Ser Val Ala Ala Ala Leu Ser Pro Leu Gly Ile Gln
1 5 10 15

Val Asp Ile Asp Val Glu His Gly Gly Lys
20 25 
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Val Ala Ala Leu Phe Pro Ala Leu Arg Pro Gly Gly Phe Gln Ala His
1 5 10 15

Tyr Arg Asp Glu Asp Gly Asp Leu Val Ala Phe Ser Ser Asp Glu Glu 
20 25 30 

Leu Thr Met Ala Met Ser Tyr Val Lys 
35 40 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

Gly Ser Pro Asp Gly Ser Leu Gln Thr Gly Lys Pro Ser Ala Pro Lys
1 5 10 15

Ser 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

Leu Arg Ser Pro Arg Gly Ser Pro Asp Gly Ser Leu Gln Thr Gly Lys
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Leu Asp Val Gly Glu Ala Met Ala Pro Gln
1 5 10

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

Glu Gln Asp Asp Thr Ala Ala Val Leu Ala Asp Phe Ile Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Val Gln Pro Glu Pro Glu Pro Glu Pro Gly Leu Leu Leu Glu Val Glu
1 5 10 15

Glu Pro Gly Thr Glu Glu Glu Arg Gly Ala Asp Asp 
20 25 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Val Gln Pro Pro Pro Glu Thr Pro Ala Glu Glu Glu Met Glu Thr Glu
1 5 10 15

Thr Glu Ala Glu Ala Leu Gln Glu Lys Glu Gly Gln Asp Asp Ala Ala
20 25 30 

Ala Met Leu 
35 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Val Gln Pro Glu Pro Glu Pro Glu Pro Gly Leu Leu Leu Glu Val Glu
1 5 10 15

Glu Pro Gly Thr
20

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

AGCGGCGGAA TTCCACC
17

CLAIMS

1. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a p62 polypeptide. 

2. The isolated nucleic acid molecule of claim 1, which is a cDNA. 

3. The isolated nucleic acid molecule of claim 2, wherein the p62 
polypeptide is human. 

4. The isolated nucleic acid molecule of claim 3 which comprises a 
nucleotide sequence selected from the group consisting of: 

a) a nucleotide sequence shown in Figure 1, SEQ ID NO:1; and

b) a nucleotide sequence shown in Figure 3, SEQ ID NO:3. 

5. The isolated nucleic acid molecule of claim 4 comprising the coding
region.

6. An isolated nucleic acid molecule comprising a nucleotide sequence 
having at least about 60% overall nucleotide sequence identity with a nucleotide 
sequence selected from the group consisting of: 

a) a nucleotide sequence shown in Figure 1, SEQ ID NO:1; and

b) a nucleotide sequence shown in Figure 3, SEQ ID NO:3.

7. The isolated nucleic acid molecule of claim 3 which hybridizes under
high stringency conditions to a nucleic acid molecule comprising a nucleotide sequence
selected from the group consisting of: 

a) a nucleotide sequence shown in Figure 1, SEQ ID NO:1; and

b) a nucleotide sequence shown in Figure 3, SEQ ID NO:3. 

8. An isolated nucleic acid molecule comprising a nucleotide sequence 
encoding a polypeptide having an amino acid sequence selected from the group 
consisting of: 

a) an amino acid sequence shown in Figure 2, SEQ ID NO:2; and

b) an amino acid sequence shown in Figure 4, SEQ ID NO:4.

9. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a ubiquitin binding domain, wherein the nucleotide sequence encoding the 
ubiquitin binding domain is selected from the group consisting of: 

a) nucleotides 1033 to 1386 of the nucleotide sequence shown in
Figure 1, SEQ ID NO:1; and

b) nucleotides 907 to 1257 of the nucleotide sequence shown in
Figure 3, SEQ ID NO:3.

10. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding an SH2 binding domain, wherein the nucleotide sequence encoding the SH2
binding domain comprises nucleotides 67 to 216 of the nucleotide sequence shown in
Figure 1, SEQ ID NO:1.

11. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a zinc finger domain, wherein the nucleotide sequence encoding the zinc finger
domain is selected from the group consisting of:

a) nucleotides 448 to 555 of the nucleotide sequence shown in
Figure 1, SEQ ID NO:1; and

b) nucleotides 322 to 429 of the nucleotide sequence shown in
Figure 3, SEQ ID NO:3.

12. An isolated nucleic acid molecule comprising a nucleotide sequence 
encoding a GTPase binding domain, wherein the nucleotide sequence encoding the 
GTPase binding domain is selected from the group consisting of: 

a) nucleotides 262 to 312 of the nucleotide sequence shown in
Figure 1, SEQ ID NO:1; and

b) nucleotides 136 to 186 of the nucleotide sequence shown in
Figure 3, SEQ ID NO:3.

13. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a polypeptide wherein the polypeptide comprises an amino acid sequence
having at least about 70% overall sequence identity with an amino acid sequence
selected from the group consisting of:

a) an amino acid sequence shown in Figure 1, SEQ ID NO:2; and

b) an amino acid sequence shown in Figure 2, SEQ ID NO:4.

14. The isolated nucleic acid molecule of claim 13, wherein the polypeptide
has a p62 activity. 

15. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a polypeptide, wherein the polypeptide binds to

a) ubiquitin, a ubiquitin analog, derivative, or active fragment; and 

b) an SH2 domain wherein the SH2 domain comprises an amino acid 
sequence having at least about 70% sequence identity with the amino acid sequence of 
the SH2 domain of p56lck.

16. The isolated nucleic acid molecule of claim 15, wherein the polypeptide
binds to the SH2 domain of p56lck.

17. The isolated nucleic acid molecule of claim 15, wherein the polypeptide
inhibits ubiquitin-dependent degradation of at least one cell cycle regulatory protein.

18. The isolated nucleic acid molecule of claim 15, wherein the polypeptide
stimulates expression of at least one cell cycle dependent kinase inhibitor.

19. The isolated nucleic acid molecule of claim 15, wherein binding of the
polypeptide to the SH2 domain is phosphotyrosine independent.

20. The isolated nucleic acid molecule of claim 15, wherein the polypeptide 
binds to at least one protein involved in the ras cell signaling cascade. 

21. An isolated nucleic acid molecule comprising a nucleotide sequence 
encoding a polypeptide, wherein the polypeptide binds to 

a) ubiquitin, a ubiquitin analog, derivative, or active fragment; and 

b) the SH2 domain of p56lck.

22. An isolated nucleic acid molecule comprising a nucleotide sequence 
encoding a polypeptide comprising a fragment of at least about 20 amino acids of the 
sequence selected from the group consisting of: 

a) an amino acid sequence shown in Figure 2, SEQ ID NO:2; and

b) an amino acid sequence shown in Figure 4, SEQ ID NO:4.

23. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a polypeptide comprising a fragment of at least about 20 amino acids of the 
sequence having at least about 70% sequence identity with an amino acid sequence 
selected from the group consisting of: 

a) an amino acid sequence shown in Figure 2, SEQ ID NO:2; and

b) an amino acid sequence shown in Figure 4, SEQ ID NO:4. 

24. The isolated nucleic acid molecule of claim 22, wherein the polypeptide 
has a p62 activity. 

25. The isolated nucleic acid molecule of claim 23, wherein the polypeptide 
has a p62 activity. 

26. An isolated nucleic acid molecule which is antisense to the nucleic acid 
molecule of claim 1.

27. An isolated nucleic acid molecule which is antisense to the nucleic acid 
molecule of claim 4. 

28. An isolated nucleic acid molecule which is antisense to the nucleic acid
molecule of claim 5.

29. A vector comprising a nucleotide sequence encoding a p62 polypeptide. 

30. A vector comprising a nucleotide sequence encoding a polypeptide
comprising an amino acid sequence selected from the group consisting of:

a) an amino acid sequence shown in Figure 2, SEQ ID NO:2; and

b) an amino acid sequence shown in Figure 4, SEQ ID NO:4. 

31. A host cell comprising the vector of claim 29.

32. A host cell comprising the vector of claim 30. 

33. A method of producing a p62 polypeptide comprising culturing a host 
cell of claim 31 in a suitable medium such that the p62 polypeptide is produced.

34. A method of producing a p62 polypeptide comprising culturing a host
cell of claim 32 in a suitable medium such that the p62 polypeptide is produced. 

35. An isolated polypeptide having a p62 activity. 

36. The isolated polypeptide of claim 35, which is human. 

37. An isolated polypeptide, wherein the polypeptide comprises an amino 
acid sequence selected from the group consisting of: 

a) an amino acid sequence shown in Figure 2, SEQ ID NO:2; and

b) an amino acid sequence shown in Figure 4, SEQ ID NO:4. 

38. An isolated polypeptide, wherein the polypeptide comprises an amino 
acid sequence having at least about 70% overall sequence identity with an amino acid
sequence selected from the group consisting of:

a) an amino acid sequence shown in Figure 2, SEQ ID NO:2; and 

b) an amino acid sequence shown in Figure 4, SEQ ID NO:4. 

39. The isolated polypeptide of claim 38, wherein the polypeptide has p62 
activity.

40. An isolated polypeptide, wherein the polypeptide binds to 

a) ubiquitin, a ubiquitin analog, derivative, or active fragment; and 

b) an SH2 domain wherein the SH2 domain comprises an amino acid 
sequence having at least about 70% sequence identity with the amino acid sequence of
the SH2 domain of p56lck.

41. The isolated polypeptide of claim 40, wherein the polypeptide ubiquitin
binding domain comprises sequence selected from the group consisting of: 

a) amino acids 323 to 440 of the amino acid sequence shown in
Figure 2, SEQ ID NO:2; and

b) amino acids 303 to 419 of the amino acid sequence shown in
Figure 4, SEQ ID NO:4.

42. The isolated polypeptide of claim 40, wherein the polypeptide SH2
binding domain comprises amino acids 1 to 50 of the amino acid sequence shown in 
Figure 2, SEQ ID NO:2. 

43. The isolated polypeptide of claim 40, further comprising a zinc finger
domain.

44. The isolated polypeptide of claim 43, wherein the zinc finger domain 
comprises an amino acid sequence selected from the group consisting of:

a) amino acids 128 to 163 of the amino acid sequence shown in
Figure 2, SEQ ID NO:2; and

b) amino acids 108 to 143 of the amino acid sequence shown in
Figure 4, SEQ ID NO:4.

45. The isolated polypeptide of claim 40, further comprising a GTPase
binding domain.

46. The isolated polypeptide of claim 45, wherein the GTPase binding 
domain comprises an amino acid sequence selected from the group consisting of:

a) amino acids 66 to 82 of the amino acid sequence shown in Figure
2, SEQ ID NO:2; and

b) amino acids 46 to 62 of the amino acid sequence shown in Figure
4, SEQ ID NO:4.

47. The isolated polypeptide of claim 40, wherein the polypeptide inhibits
ubiquitin-dependent degradation of at least one cell cycle regulatory protein.

48. The isolated polypeptide of claim 40, wherein the polypeptide stimulates 
expression of at least one cell cycle dependent kinase inhibitor. 

49. The isolated polypeptide of claim 40, wherein the polypeptide binding to 
the SH2 domain is phosphotyrosine independent. 

50. The isolated polypeptide of claim 40, wherein the polypeptide binds to at 
least one protein involved in the ras cell signaling cascade.

51. An isolated polypeptide, wherein the polypeptide binds to

a) ubiquitin, a ubiquitin analog, derivative, or active fragment; and 

b) the SH2 domain of p56lck.

52. An isolated polypeptide comprising a fragment of at least about 20 amino
acids of the sequence selected from the group consisting of:

a) a fragment of an amino acid sequence shown in Figure 2, SEQ ID
NO:2; and

b) a fragment of an amino acid sequence shown in Figure 4, SEQ ID
NO:4.



53. The isolated polypeptide of claim 52, wherein the fragment further 
comprises an amino acid substitution, deletion, or addition. 

54. An isolated polypeptide comprising a fragment of at least about 20 amino
acids of the sequence having at least about 70% sequence identity with a fragment of an
amino acid sequence selected from the group consisting of:

a) a fragment of an amino acid sequence shown in Figure 2, SEQ ID
NO:2; and

b) a fragment of an amino acid sequence shown in Figure 4, SEQ ID
NO:4.

55. The isolated polypeptide of claim 52, wherein the polypeptide has a p62 
activity. 

56. The isolated polypeptide of claim 54, wherein the polypeptide has a p62 
activity. 



57. The isolated polypeptide of claim 54, wherein the polypeptide comprises 
a ubiquitin binding domain.

58. The isolated polypeptide of claim 54, wherein the polypeptide comprises 
an SH2 binding domain. 



59. A fusion polypeptide comprising a p62 polypeptide and a second 
polypeptide portion having an amino acid sequence from a protein unrelated to an amino
acid sequence selected from the group consisting of an amino acid sequence shown in
Figure 2, SEQ ID NO:2 and an amino acid sequence shown in Figure 4, SEQ ID NO:4. 

60. A pharmaceutical composition comprising the polypeptide of claim 38 
and a pharmaceutically acceptable carrier.

61. A pharmaceutical composition comprising the polypeptide of claim 40
and a pharmaceutically acceptable carrier.

62. A pharmaceutical composition comprising the polypeptide of claim 52
and a pharmaceutically acceptable carrier.

63. A vaccine composition comprising the vector of claim 29. 

64. A vaccine composition comprising the vector of claim 30.

65. An antibody which binds a p62 polypeptide or a fragment thereof. 

66. A method for inhibiting cell proliferation in a subject, comprising
administering to the subject a therapeutically effective amount of a p62 polypeptide or
fragment thereof. 

67. A method for treating cervical cancer in a subject comprising 
administering to the subject a therapeutically effective amount of an agent which
modulates p62 expression.

68. A method for modulating T cell activity in a subject comprising 
administering to the subject a therapeutically effective amount of an agent which 
activates or inhibits a p62 polypeptide. 

69. A method for identifying an agent which inhibits a p62 polypeptide, 
comprising 

a) contacting a first polypeptide comprising an SH2 domain of 
p56lck with a second polypeptide comprising a p62 polypeptide and an agent to be
tested; and

b) determining binding of the second polypeptide to the first
polypeptide, wherein an inhibition of binding of the first polypeptide to the second 
polypeptide indicates that the agent is an inhibitor of a p62 polypeptide. 

70. A p62 polypeptide inhibitory agent identified according to the method of
claim 69.

71. A method for identifying an agent which activates a p62 polypeptide,
comprising

a) contacting a first polypeptide comprising an SH2 domain of
p56lck with a second polypeptide comprising a p62 polypeptide and an agent to be
tested; 

b) determining binding of the second polypeptide to the first 
polypeptide wherein an activation of binding of the first polypeptide to the second 
polypeptide indicates that the agent is an activator of a p62 polypeptide.

72. A p62 polypeptide activating agent identified according to the method of 
claim 71. 

73. A method for identifying an agent which inhibits a p62 polypeptide,
comprising

a) contacting a first polypeptide comprising ubiquitin, a ubiquitin 
analog, derivative or active fragment, with a second polypeptide comprising a p62 
polypeptide and an agent to be tested; and

b) determining binding of the second polypeptide to the first
polypeptide, wherein an inhibition of binding of the first polypeptide to the second
polypeptide indicates that the agent is an inhibitor of a p62 polypeptide.

74. A p62 polypeptide inhibitory agent identified according to the method of
claim 73.

75. A method for identifying an agent which activates a p62 polypeptide, 
comprising 

a) contacting a first polypeptide comprising ubiquitin, a ubiquitin 
analog, derivative or active fragment, with a second polypeptide comprising a p62
polypeptide and an agent to be tested;

b) determining binding of the second polypeptide to the first
polypeptide wherein an activation of binding of the first polypeptide to the second
polypeptide indicates that the agent is an activator of a p62 polypeptide.

76. A p62 polypeptide activating agent identified according to the method of
claim 75.

77. A method for identifying an agent which inhibits a p62 polypeptide, 
comprising: 

a) contacting a first polypeptide comprising p53 protein, p53 analog,
derivative or active fragment, with a second polypeptide comprising a p62 polypeptide
and an agent to be tested;

b) measuring the level of p53 degradation in the presence of the
agent; and

c) comparing the level of p53 degradation in the presence of the
agent to level of p53 degradation in the absence of the agent,

wherein an increase in the level of p53 degradation in the presence of the agent indicates 
that the agent is an inhibitor of a p62 polypeptide. 

78. A p62 polypeptide inhibitory agent identified according to the method of 
claim 77. 

79. A method for identifying an agent which activates a p62 polypeptide, 
comprising:

a) contacting a first polypeptide comprising p53 protein, p53 analog, 
derivative or active fragment, with a second polypeptide comprising a p62 polypeptide 
and an agent to be tested; 

b) measuring the level of p53 degradation in the presence of the
agent; and

c) comparing the level of p53 degradation in the presence of the 
agent to level of p53 degradation in the absence of the agent, 

wherein a decrease in the level of p53 degradation in the presence of the agent indicates 
that the agent is an activator of a p62 polypeptide.

80. A p62 polypeptide activating agent identified according to the method of
claim 79.

81. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a p160 polypeptide.

82. The isolated nucleic acid molecule of claim 81 which comprises a 
nucleotide sequence shown in Figure 8, SEQ ID NO:6 or Figure 10, SEQ ID NO:8. 

83. An isolated polypeptide having a p160 activity.

84. The isolated polypeptide of claim 83 which comprises an amino acid
sequence shown in Figure 9, SEQ ID NO:7 or Figure 11, SEQ ID NO:9 or a fragment 
thereof. 

85. A method for modulating T cell activity in a subject comprising 
administering to the subject a therapeutically effective amount of an agent which 
activates or inhibits a p160 polypeptide.
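
The nucleotide ranges recited in claims 9 to 12 and the amino acid ranges recited in claims 41, 42, 44 and 46 can be cross-checked by codon arithmetic. The following Python sketch is illustrative only and forms no part of the claimed subject matter. It assumes, from the sequence listings, that the open reading frame of Figure 1 begins at the ATG at nucleotide 67 and that the partial reading frame of Figure 3 begins at nucleotide 1; the function name `nt_range_to_aa_range` and the constants are hypothetical.

```python
def nt_range_to_aa_range(nt_start, nt_end, cds_start):
    """Map an inclusive, 1-based nucleotide range to the inclusive,
    1-based amino acid range it encodes.

    cds_start is the position of the first base of the first codon.
    The range must begin and end on codon boundaries.
    """
    offset = nt_start - cds_start
    length = nt_end - nt_start + 1
    if offset < 0 or offset % 3 != 0 or length % 3 != 0:
        raise ValueError("range is not aligned to whole codons")
    first = offset // 3 + 1
    last = first + length // 3 - 1
    return first, last


# Assumed reading-frame starts, read off the figure listings.
CDS_START_FIG1 = 67  # ATG in Figure 1 (SEQ ID NO:1)
CDS_START_FIG3 = 1   # first codon of the partial cDNA in Figure 3 (SEQ ID NO:3)

# Nucleotide ranges as recited in claims 9 to 12.
fig1_domains = {
    "SH2 binding": (67, 216),
    "GTPase binding": (262, 312),
    "zinc finger": (448, 555),
    "ubiquitin binding": (1033, 1386),
}
fig3_domains = {
    "GTPase binding": (136, 186),
    "zinc finger": (322, 429),
    "ubiquitin binding": (907, 1257),
}

for name, (start, end) in fig1_domains.items():
    print("Figure 1", name, nt_range_to_aa_range(start, end, CDS_START_FIG1))
for name, (start, end) in fig3_domains.items():
    print("Figure 3", name, nt_range_to_aa_range(start, end, CDS_START_FIG3))
```

Under these assumptions the computed amino acid ranges agree with those recited for the corresponding domains in claims 41, 42, 44 and 46.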

p62.seg2 Length: 2083 Type: N Check: 6984

1 gaattcggca cgaggcgcgg cggctgcgac cgggacggcc cattttccgc 

51 cagctcgccg cccgctatgg cgtcgctcac cgtgaaggcc taccttctgg 

101 gcaaggagga cgcggcgcgc gagattcgcc gcttcagctt ctgctgcagc 

151 cccgagcctg aggcggaagc cgaggctgcg gcgggtccgg gaccctgcga 

201 gcggctgctg agccgggtgg ccgccctgtt ccccgcgctg cggcctggcg 

251 gcttccaggc gcactaccgc gatgaggacg gggacttggt tgccttttcc 

301 agtgacgagg aattgacaat ggccatgtcc tacgtgaagg atgacatctt 

351 ccgaatctac attaaagaga aaaaagagtg ccggcgggac caccgcccac 

401 cgtgtgctca ggaggcgccc cgcaacatgg tgcaccccaa tgtgatctgc 

451 gatggctgca atgggcctgt ggtaggaacc cgctacaagt gcagcgtctg 

501 cccagactac gacttgtgta gcgcctgcga gggaaagggc ttgcaccggg 

551 ggcacaccaa gctcgcattc cccagcccct tcgggcacct gtctgagggc 

601 ttctcgcaca gccgctggct ccggaaggtg aaacacggac acttcgggtg 

651 gccaggatgg gaaatgggtc caccaggaaa ctggagccca cgtcctcctc 

701 gtgcagggga ggcccgccct ggccccacgg cagaatcagc ttctggtcca 

751 tcggaggatc cgagtgtgaa tttcctgaag aacgttgggg agagtgtggc 

801 agctgccctt agccctctgg gcattgaagt tgatatcgat gtggagcacg 

851 gagggaaaag aagccgcctg acccccgtct ctccagagag ttccagcaca 

901 gaggagaaga gcagctcaca gccaagcagc tgctgctctg accccagcaa 

951 gccgggtggg aatgttgagg gcgccacgca gtctctggcg gagcagatga 

1001 ggaagatcgc cttggagtcc gaggggcgcc ctgaggaaca gatggagtcg 

1051 gataactgtt caggaggaga tgatgactgg acccatctgt cttcaaaaga 

1101 agtggacccg tctacaggtg aactccagtc cctacagatg ccagaatccg

1151 aagggccaag ctctctggac ccctcccagg agggacccac agggctgaag

1201 gaagctgcct tgtacccaca tctaccgcca gaggctgacc cgcggctgat

1251 tgagtccctc tcccagatgc tgtccatggg cttctctgat gaaggcggct

1301 ggctcaccag gctcctgcag accaagaact atgacatcgg agcggctctg

1351 gacaccatcc agtattcaaa gcatcccccg ccgttgtgac cacttttgcc

1401 cacctcttct gcntgcccct cttctgtctc atagttgtgt taagcttgcg

1451 tagaattgca ggtctctgta cgggccagtt tctctgcctt cttccaggat

1501 caggggttag ggtgcaagaa gccatttagg gcagcaaaac aagtgacatg

1551 aagggagggt ccctgtgtgt gtgtgtgctg atgtttcctg ggtgccctgg

1601 ctccttgcag cagggctggg cctgcgagac ccaaggctca ctgcagcgcg

1651 ctcctgaccc ctccctgcag gggctacgtt agcagcccag cacatagctt

1701 gcctaatggc tttcactttc tcttttgttt taaatgactc ataggtccct

1751 gacatttagt tgattatttt ctgctacaga cctggtacac tctgatttta

1801 gataaagtaa gcctaggtgt tgtcagcagg caggctgggg aggccagtgt

1851 tgtgggcttc ctgctgggac tgagaaggct cacgaagggc atccgcaatg

1901 ttggtttcac cgagagctgc ctcctggtct cttcaccact gtagttctct

1951 catttccaaa ccatcagctg cttttaaaat aagatctctt tgtagccatc

2001 ctgttaaatt tgtaaacaat ctaattaaat ggcatcagca ctttaaccaa

2051 taaaaaaaaa aaaaaaaaaa aaaactcgag gga

p62.pep Length: 440 Type: P Check: 164

1 MASLTVKAYL LGKEDAAREI RRFSFCCSPE PEAEAEAAAG PGPCERLLSR

51 VAALFPALRP GGFQAHYRDE DGDLVAFSSD EELTMAMSYV KDDIFRIYIK

101 EKKECRRDHR PPCAQEAPRN MVHPNVICDG CNGPVVGTRY KCSVCPDYDL

151 CSVCEGKGLH RGHTKIAFPS PFGHLSEGFS HSRWLRKVKH GHFGWPGWEM

201 GPPGNWSPRP PRAGEARPGP TAESASGPSE DPSVNFLKNV GESVAAALSP

251 LGIEVDIDVE HGGKRSRLTP VSPESSSTEE KSSSQPSSCC SDPSKPGGNV

301 EGATQSLAEQ MRKIALESEG RPEEQMESDN CSGGDDDWTH LSSKEVDPST

351 GELQSLQMPE SEGPSSLDPS QEGPTGLKEA ALYPHLPPEA DPRLIESLSQ

401 MLSMGFSDEG GWLTRLLQTK NYDIGAALDT IQYSKHPPPL



FIG. 2 
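
As a consistency check between Figures 1 and 2, the first codons of the open reading frame of Figure 1 can be translated with the standard genetic code and compared with the first residues of Figure 2. The following is a minimal Python sketch, assuming that the reading frame starts at the ATG at nucleotide 67 of Figure 1; the helper `translate` and the variable names are hypothetical and are not taken from the specification.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Nucleotides 67 to 111 of Figure 1, copied from the listing above.
fig1_nt_67_to_111 = "atggcgtcgctcaccgtgaaggcctaccttctgggcaaggaggac"
print(translate(fig1_nt_67_to_111))
```

The translated residues can be compared directly with the start of the p62.pep listing in Figure 2.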

p62daudi.seg Length: 1977 Check: 2184

1 cgccgcttca gcttctgctt cagcccggag cccgaggccg aagccgaggc

51 cgcgcctggc ccccggccct gtgagcggct gccgaaccgg gtggctgcgc

101 tctttcctgt gctccggccc ggcggctttc aggcgcacta ccgcgatgag

151 gatggggact tggttgcctt ttccagtgac gaggagctga cgatggcgat

201 gtcatatgtg aaggacgaca tcttccgcat ttacattaaa gagaagaagg

251 agtgtcggag ggatcagcgc ccctcatgtg cccaggaggt gcccagaaac

301 atggtgcacc ccaacgtgat ctgtgacggc tgtaacgggc ccgtggtggg

351 gacgcgctac aagtgcagcg tctgccctga ctacgaccta ttctccgcct

401 gcgagggcaa gggcctgcac cgggaacacg gcaagctggc tttccccagc

451 cccattgggc acttctctga gggcttctct cacagccgct ggctccggaa

501 gctgaaacat gggcaatttg ggtggcctgc ctgggacatg ggcacaccgg

551 ggaactggag cccacgtcct cctcaggcag gggatgccca ccctgcccct

601 gccacggaat cagcctctgg tccatcggaa catcccagtg tgaatttcct

651 caagaacgta ggggagagtg tggcggctgc cctcaagcct ctagggattg

701 aagtcgatat tgtagtggaa acgcgaggca agagaagccg cctgaccccc

751 acctctgcag gcagttccag cacagaggag aagtgtagct ctcagccaag

801 cagctgctgc tctgacccca gcaagccaga cagggacgtg gagggcacag

851 cacagtctct gacggagcag atgaataaga tcgccctgga gtcagggggt

901 cagcatgagg aacagatgga gtctgataac tgttcaggag gagatgatga

951 ctggactcat ctgtcttcaa aagaggtgga cccgtctaca ggtgaactgc

1001 agtctctaca gatgcctgag tctgaagggc caagctctct ggatggttcc

1051 caggaaggac ccacaggact gaaggaagct gaactgtacc cacatctgcc

1101 accagaagct gacccccggc tgattgagtc cctctcccag atgctgtcca

1151 tggtctctga tgaaggtggc tggctcacca ggcttctgca gaccaagaat

1201 tacgacatcg gggctgccct gaacaccatc cagtattcaa aacacccacc

1251 acctttgtga cgatgtttgc tcacccattc tgtgtcccct ttgagttagt

1301 gtagaacccc actgcctcta agtcccaatt tctcgtcatt cttctttcag

1351 aatctggggg gtggggatgc agaaagccct ttagggcagt agaacaagtg

1401 acacgggggg agttccaagg gtgtgagTGC GGATTCTGAG AAAcactgat

1451 cagcttccca tggatgctgg ctccttccag ccaggggacc ccgccctggg

1501 gcagagcgag agactcctcg ctggggagga cgtggagacc atactgcatc

1551 ttatccgtac tctccctgca ggattacacc agcagtccag aagagatctt

1601 gccaaatggc tttctgcttt ttctttgtat aggacactga tatgtaactg

1651 attttatgct agaagtttga tatcctctga atttagctaa aggatcacca

1701 gcattcaccc cggggtggaa gaggctgtcc tgtagcaatt acagctcagg

1751 actgtGGCTA ACATCTGAGg aataaagaag ggctgacaga ggaactgatg

1801 ctgttcagag tactgcctat ttcataacca ctgtagttac cgtttccaaa

1851 cctgtcagct gcttttaaag ttaagaaaat cgctttgtaa ccattctatt

1901 tgtaaacaat tttaattaat taaaggtata agcactttaa tcaaaaaaaa

1951 aaaaaaaaaa ttccaccaca ctggcgg

p62daudi.pep Length: 420 Type: P Check: 4693

1 RRFSFCFSPE PEAEAEAAPG PRPCERLLNR VAALFPVLRP GGFQAHYRDE

51 DGDLVAFSSD EELTMAMSYV KDDIFRIYIK EKKECRRDQR PSCAQEVPRN

101 MVHPNVICDG CNGPVVGTRY KCSVCPDYDL FSACEGKGLH REHGKLAFPS

151 PIGHFSEGFS HSRWLRKLKH GQFGWPAWDM GTPGNWSPRP PQAGDAHPAP

201 ATESASGPSE HPSVNFLKNV GESVAAALKP LGIEVDIVVE TRGKRSRLTP

251 TSAGSSSTEE KCSSQPSSCC SDPSKPDRDV EGTAQSLTEQ MNKIALESGG

301 QHEEQMESDN CSGGDDDWTH LSSKEVDPST GELQSLQMPE SEGPSSLDGS

351 QEGPTGLKEA ELYPHLPPEA DPRLIESLSQ MLSMVSDEGG WLTRLLQTKN

401 YDIGAALNTI QYSKHPPPL*



FIG. 4 



WO 97/22255 PCT/US96/19944

127 WFFKNLSRKD AERQLLAPGN THGSFLIRES ESTAGSFSLS VRDFDQNQGE 176
177 VVKHYKIRNL DNGGFYISPR ITFPGLHELV RHYTNASDGL CTRLSRPCQT 226
227 Q

p62.seg2 x p62daudi.seg

 101 gcaaggaggacgcggcgcgcgagattcgccgcttcagcttctgctgcagc 150
   1                           cgccgcttcagcttctgcttcagc 24

 151 cccgagcctgaggcggaagccgaggctgcggcgggtccgggaccctgcga 200
  25 ccggagcccgaggccgaagccgaggccgcgcctggcccccggccctgtga 74

 201 gcggctgctgagccgggtggccgccctgttccccgcgctgcggcctggcg 250
  75 gcggctgccgaaccgggtggctgcgctctttcctgtgctccggcccggcg 124

FIG. 6A
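
Percent identity between two aligned rows, such as those of Figure 6, can be estimated by counting matching columns. The following Python sketch is illustrative only. It assumes one possible convention, in which columns containing a gap character are skipped, and this is not necessarily the method by which the "overall sequence identity" recited in the claims is determined. The function name `percent_identity` is hypothetical.

```python
def percent_identity(row_a, row_b, gap_chars=".-"):
    """Percent of ungapped alignment columns in which both rows agree.

    Both rows must already be aligned and of equal length. Columns in
    which either row carries a gap character are skipped (an assumed
    convention, not one stated in the specification).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(row_a.lower(), row_b.lower()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# One pair of rows copied from Figure 6A above.
top = "cccgagcctgaggcggaagccgaggctgcggcgggtccgggaccctgcga"     # p62.seg2 151-200
bottom = "ccggagcccgaggccgaagccgaggccgcgcctggcccccggccctgtga"  # p62daudi.seg 25-74
print(round(percent_identity(top, bottom), 1))
```

Applying the same function to every row pair of the alignment and weighting by the number of compared columns would give an overall figure for the two cDNAs under this convention.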



 251 gcttccaggcgcactaccgcgatgaggacggggacttggttgccttttcc 300
 125 gctttcaggcgcactaccgcgatgaggatggggacttggttgccttttcc 174

 301 agtgacgaggaattgacaatggccatgtcctacgtgaaggatgacatctt 350
 175 agtgacgaggagctgacgatggcgatgtcatatgtgaaggacgacatctt 224

 351 ccgaatctacattaaagagaaaaaagagtgccggcgggaccaccgcccac 400
 225 ccgcatttacattaaagagaagaaggagtgtcggagggatcagcgcccct 274

 401 cgtgtgctcaggaggcgccccgcaacatggtgcaccccaatgtgatctgc 450
 275 catgtgcccaggaggtgcccagaaacatggtgcaccccaacgtgatctgt 324

 451 gatggctgcaatgggcctgtggtaggaacccgctacaagtgcagcgtctg 500
 325 gacggctgtaacgggcccgtggtggggacgcgctacaagtgcagcgtctg 374

 501 cccagactacgacttgtgtagcgtctgcgagggaaagggcttgcaccggg 550
 375 ccctgactacgacctattctccgcctgcgagggcaagggcctgcaccggg 424

 551 agcacaccaagctcgcattccccagccccttcgggcacctgtctgagggc 600
 425 aacacggcaagctggctttccccagccccattgggcacttctctgagggc 474

 601 ttctcgcacagccgccggctccggaaggtgaaacacggacacttcgggtg 650
 475 ttctctcacagccgccggctccggaagctgaaacatgggcaatttgggtg 524

 651 gccaggatgggaaatgggtccaccaggaaactggagcccacgtcctcctc 700
 525 gcctgcctgggacatgggcacaccggggaactggagcccacgtcctcctc 574

 701 gtgcaggggaggcccgccctggccccacggcagaatcagcttctggtcca 750
 575 aggcaggggatgcccaccctgcccctgccacggaatcagcctctggtcca 624

 751 tcggaggatccgagtgtgaatttcctgaagaacgttggggagagtgtggc 800
 625 tcggaacatcccagtgtgaatttcctcaagaacgtaggggagagtgtggc 674

 801 agctgcccttagccctctgggcattgaagttgatatcgatgtggagcacg 850
 675 ggctgccctcaagcctctagggattgaagtcgatattgtagtggaaacgc 724

 851 gagggaaaagaagccgcctgacccccgtctctccagagagttccagcaca 900
 725 gaggcaagagaagccgcctgacccccacctctgcaggcagttccagcaca 774

 901 gaggagaagagcagctcacagccaagcagctgctgctctgaccccagcaa 950
FIG. 6B 
775 gaggagaagtgtagctctcagccaagcagctgctgctctgaccccagcaa 824

951 gccgggtgggaatgttgagggcgccacgcagtctctggcggagcagatga 1000
825 gccagacagggacgtggagggcacagcacagtctctgacggagcagatga 874

1001 ggaagatcgccttggagtccgaggggcgccctgaggaacagatggagtcg 1050
875 ataagatcgccctggagtcagggggtcagcatgaggaacagatggagtct 924

1051 gataactgttcaggaggagatgatgactggacccatctgtcttcaaaaga 1100
925 gataactgttcaggaggagatgatgactggactcatctgtcttcaaaaga 974

1101 agtggacccgtctacaggtgaactccagtccctacagatgccagaatccg 1150
975 ggtggacccgtctacaggtgaactgcagtctctacagatgcctgagtctg 1024

1151 aagggccaagctctctggacccctcccaggagggacccacagggctgaag 1200
1025 aagggccaagctctctggatggttcccaggaaggacccacaggactgaag 1074

1201 gaagctgccttgtacccacatctaccgccagaggctgacccgcggctgat 1250
1075 gaagctgaactgtacccacatctgccaccagaagctgacccccggctgat 1124

1251 tgagtccctctcccagatgctgtccatgggcttctctgatgaaggcggct 1300
1125 tgagtccctctcccagatgctgtccatgg...tctctgatgaaggtggct 1171

1301 ggctcaccaggctcctgcagaccaagaactatgacatcggagcggctctg 1350
1172 ggctcaccaggcttctgcagaccaagaattacgacatcggggctgccctg 1221

1351 gacaccatccagtattcaaagcatcccccgccgttgtgaccacttttgcc 1400
1222 aacaccatccagtattcaaaacacccaccacctttgtgacgatgtttgct 1271

1401 cacctcttctgcntgcccctcttctgtctcatagttgtgttaagcttgcg 1450
1272 cacccattctgtgtcccc....................tttgagttagtg 1301

1451 tagaattgcaggtctctgtacgggccagtttctctgccttcttc.....c 1495
1302 tagaacccca.ctgcctctaagtcccaatttctcgtcattcttctttcag 1350

1496 aggatcaggggttagggtgcaagaagccatttagggcagcaaaacaagtg 1545
1351 aatctggggggtggggatgcagaaagccctttagggcagtagaacaagtg 1400

1546 acatgaagggagggtc..............cctgtgtgtgtgtgtgctga 1581
1401 acacggggggagttccaagggtgtgagTGCGGATTCTGAGAAAcactgat 1450
1582 . tgtt tcc tggg tgccc tggc tcc ttgcagcaggg c tggg 1620
1451 cagcttcccatggatgctggctccttccagccaggggaccccgccctggg 1500

1621 cctgcgagacccaaggctcactgcagcg.....................c 1649
1501 gcagagcgagagactcctcgctggggaggacgtggagaccatactgcatc 1550

1650 gctcctgacccctccctgcaggggctacgttagcagcccagcacatagct 1699
1551 ttatccgtactctccctgca.ggattacaccagcagtccagaagagatct 1599

1700 tgcctaatggctttcactttctcttttgttttaaatgactcataggtccc 1749
1600 tgccaaatggctttctgctttttctttgt............ataggacac 1637

1750 tgacatttagttgattattttctgctacagacctggtacactctgatttt 1799
1638 tgatatgtaactg...attttatgctagaagtttgatatcctctgaattt 1684

1800 agataaagtaagcctaggtgttgtcagcaggcaggctggggaggcc...a 1846
1685 agctaaaggatcaccagcattcaccccggggtggaagaggctgtcctgta 1734

1847 gtgttgtgggcttcctgctgggactga.....gaaggctcacgaagggca 1891
1735 gcaattacagctcaggactgtGGCTAACATCTGAGgaataaagaagggct 1784

1892 tccgcaatgttggtttcactgagagctgcctcctggtctcttcaccactg 1941
1785 gacagaggaactgatgctgt.tcagagtactgcctatttcataaccactg 1833

1942 tagttctctcatttccaaaccatcagctgcttttaa....aataagatct 1987
1834 tagtt.accgtttccaaacctgtcagctgcttttaaagttaagaaaatcg 1882

1988 ctttgtagccatcctgttaaatttgtaaacaatctaattaaatggcatca 2037
1883 ctttgtaaccattctatttgtaaacaattttaattaattaaa.ggtataa 1931

2038 gcactttaaccaataaaaaaaaaaaaaaaaaaaaaaactcgaggga 2083
1932 gcactttaatcaaaaaaaaaaaaaaaaaattccaccacactggcgg 1977
p62.pep x p62daudi.pep

1 MASLTVKAYLLGKEDAAREIRRFSFCCSPEPEAEAEAAAGPGPCERLLSR 50
1 RRFSFCFSPEPEAEAEAAPGPRPCERLLNR 30

51 VAALFPALRPGGFQAHYRDEDGDLVAFSSDEELTMAMSYVKDDIFRIYIK 100
31 VAALFPVLRPGGFQAHYRDEDGDLVAFSSDEELTMAMSYVKDDIFRIYIK 80

101 EKKECRRDHRPPCAQEAPRNMVHPNVICDGCNGPVVGTRYKCSVCPDYDL 150
81 EKKECRRDQRPSCAQEVPRNMVHPNVICDGCNGPVVGTRYKCSVCPDYDL 130

151 CSVCEGKGLHRGHTKLAFPSPFGHLSEGFSHSRWLRKVKHGHFGWPGWEM 200
131 FSACEGKGLHREHGKLAFPSPIGHFSEGFSHSRWLRKLKHGQFGWPAWDM 180

201 GPPGNWSPRPPRAGEARPGPTAESASGPSEDPSVNFLKNVGESVAAALSP 250
181 GTPGNWSPRPPQAGDAHPAPATESASGPSEHPSVNFLK^A^GESVA7\ALKP 230

251 LGIEVDIDVEHGGKRSRLTPVSPESSSTEEKSSSQPSSCCSDPSKPGGNV 300
231 LGIEVDIWETRGKRSRLTPTSAGSSSTEEKCSSQPSSCCSDPSKPDRDV 280

301 EGATQSLAEQMRKIALESEGRPEEQMESDNCSGGDDDWTHLSSKEVDPST 350
281 EGTAQSLTEQMNKIALESGGQHEEQMESDNCSGGDDDWTHLSSKEVDPST 330

351 GELQSLQMPESEGPSSLDPSQEGPTGLKEAALYPHLPPEADPRLIESLSQ 400
331 GELQSLQMPESEGPSSLDGSQEGPTGLKEAELYPHLPPEADPRLIESLSQ 380

401 MLSMGFSDEGGWLTRLLQTKNYDIGAALDTIQYSKHPPPL. 440
381 MLSM.VSDEGGWLTRLLQTKNYDIGAALNTIQYSKHPPPL* 420
p160 DNA sequence

p160dna Length: 3901 Type: N Check: 3842

1 ggggcagccg ttctgagtgg gccctctgcg ggctccgcgg ctggggttcc 

51 tggcgggacc gggggtctct cggcagtgag ctcgggcccg cggctccgcc 

101 tgctgctgct ggagagtgtt tctggtttgc tgcaacctcg aacygggtct 

151 gccgttgctc cggtgcatcc cccaaaccgc tcggccccac atttgcccgg 

201 gctcatgtgc ctattgcggc tgcatgggtc ggtgggcggg gcccagaacc 

251 tttcagctct tggggcattg gtgagtctca gtaatgcacg tctcagttcc 

301 atcaaaactc ggtttgaggg cctgtgtctg ctgtccctgc tggtagggga 

351 gagccccaca gagctattcc agcagcactg tgtgtcttgg cttcggagca 

401 ttcagcaggt gttacagacc caggacccgc ctgccacaat ggagctggcc 

451 gtggctgtcc tgagggacct cctccgatat gcagcccagc tgcctgcact

501 gttccgggac atctccatga accacctccc tggccttctc acctccctgc 

551 tgggcctcag gccagagtgt gagcagtcag cattggaagg aatgaaggct 

601 tgtatgacct atttccctcg ggcttgtggt tctctcaaag gcaagctggc 

651 ctcatttttt ctgtctaggg tggatgcctt gagccctcag ctccaacagt 

701 tggcctgtga gtgttattcc cggctgccct ctttaggggc tggcttttcc 

751 caaggcctga agcacaccga gagctgggag caggagctac acagtctgct 

801 ggcctcactg cacaccctgc tgggggccct gtacgaggga gcagagactg 

851 ctcctgtgca gaatgaaggc cctggggtgg agatgctgct gtcctcagaa 

901 gatggtgatg cccatgtcct tctccagctt cggcagaggt tttcgggact 

951 ggcccgctgc ctagggctca tgctcagctc tgagtttgga gctcccgtgt 

1001 ccgtccctgt gcaggaaatc ctggatttca tctgccggac cctcagcgtc 

1051 agtagcaaga atantgtaag tgggatttgt catctcttca gagcccttgc 
1101 tcaggatacc aggcaaccag gaaagtactg gggacctgag tctccccaaa

1151 cagtgtcatc ctggagtccg tcccagagag cttctacttt tgtccaaata

1201 acatcacttc ctatgtgtcg tgacacagga gcacagtgtc agagtgtagc

1251 aaatgcttcc ttgggggagg gtgaatttgg ggactcagct gagtcattgc

1301 tgagaggccc agccatcctt cttaccttcc atccagggtc tattttagag

1351 gataggggtt tgattttgtt gggagagatg agatcagggg ttgggtttct

1401 tacctatgtg tacatatgta aatggtcatt ccctgtttct gtctctctct

1451 ggctctcact ttcttcctcc actctttatc tctgcccctt ttttctccag

1501 agcttgcatg gagatggtcc ctgcggctgc tgctgctgcc ctctatccac

1551 cttgaaggcc ttggacctgc tgtctgcact catcctcgcg tgtggaagcc

1601 ggctcttgcg ctttgggatc ctgatcggcc gcctgcttcc ccaggtcctc

1651 aattcctgga gcatcggtag agattccctc tctccaggcc aggagaggcc

1701 ttacagcacg gttcggacca aggtgtatgc gatattagag ctgtgggtgc

1751 aggtttgtgg ggcctcggcg ggaatgcttc agggaggagc ctctggagag

1801 gccctgctca cccacctgct cagcgacatc tccccgccag ctgatgccct

1851 taagctgcgt agcccgcggg ggagccctga tgggagtttg cagactggga

1901 agcctagcgc ccccaagaag ctaaagctgg atgtggggga agctatggcc

1951 ccgccaagcc accggaaagg ggatagcaat gccaacagcg acgtgtgtcc

2001 ggctgcactc agaggcctca gccggaccat cctcatgtgt gggcctctca

2051 tcaaggagga gactcacagg agactgcatg acctggtcct ccccctggtc

2101 atgggtgtac agcagggtga ggtcctaggc agctccccgt acacgagctc

2151 ccctgccgcc gtgaactcta ctgcctgctg ctggcgctgc tgctggcccc

2201 gtctcctcgc tgcccacctc ctcttgcctg tgccctgcaa gccttctccc

2251 tcggccagcg agaagatagc cttgaggtct cctctttctt gctcagaagc
2301 actggtgacc tgtgctgctc tgacccaccc ccgggttcct cccctgcagc

2351 ccatgggccc cacctgcccc acacctgctc cagtccccct cctgaggccc

2401 catcgccctt cagggcccca ccgttccatc ctccgggccc catgccctca

2451 gtgggctcca tgccctcagc aggccccatg cccttcagca ggccccatgc

2501 cctcagcagg ccctgtgccc tcggagccct ggacctccac cacagccaac

2551 ctcctaggcc ttctgtccag gcctagtgtc tgtcctcccc ggcttcttcc

2601 tggccctgag aaccaccggg caggctcaaa tgaggacccc atccttgccc

2651 ctagtgggac tcccccacct actatacccc cagatgaaac ttttgggggg

2701 agagtgccca gaccagcctt tgtccactat gacaaggagg aggcatctga

2751 tgtggagatc tccttggaaa gtgactctga tgacagcgtg gtgatcgtgc

2801 ccgaggggct tccccccctg ccacccccac caccctcagg tgccacacca

2851 ccccctatag cccccactgg gccgccaaca gcctcccctc ctgtgccagc

2901 gaaggaggag cctgaagaac ttcctgcggc cccagggcct ctcccgccgc

2951 ccccacctcc gccgccgcct gttcctggtc ctgtgaccct ccctccaccc

3001 cagttggtcc ctgaagggac tcctggtggg ggaggacccc cagccctgga

3051 agaggatttg acagttatta atatcaacag cagtgatgaa gaggaggagg

3101 aagaaggaga agaggaagaa gaagaagaag aagaagaaga ggaagaagaa

3151 gaagaggaag aagaggaaga ggaggaagac tttgaggaag aggaagagga

3201 tgaagaggaa tattttgaag aggaagaaga ggaggaagaa gagtttgagg

3251 aagaatttga ggaagaagaa ggtgagttag aggaagaaga agaagaggag

3301 gatgaggagg aggaagaaga actggaagag gtggaagacc tggagtttgg

3351 cacagcagga ggggaggtag aagaaggtgc accaccaccc ccaaccctgc

3401 ctccagctct gcctccccct gagtctcccc caaaggtgca gccagaaccc

3451 gaacccgaac ccgggctgct tttggaagtg gaggagccag ggacggagga
3501 ggagcgtggg gctgacacag ctcccaccct ggcccctgaa gcgctcccct

3551 cccagggaga ggtggagagg gaaggggaaa gccctgcggc agggccccct

3601 ccccaggagc ttgttgaaga agagccctct nctcccccaa ccctgttgga

3651 agaggagact gaggatggga gtgacaaggt gcagccccca ccagagacac

3701 ctgcagaaga agagatggag acagagacag aggccgaagc tctccaggaa

3751 aaggagcagg atgacacagc tgccatgctg gccgacttca tcgattgtcc

3801 ccctgatgat gagaagccac cacctcccac agagcctgac tcctagccat

3851 cttctgcacc ccacctcttt gtttccaata aagttatgtc cttaaaaaaa

3901 a
p160dna-3 Length: 3211 Type: N Check: 2308

1 ggggcagccg ttctgagtgg gccctctgcg ggctccgcgg ctggggttcc 
51 tggcgggacc gggggtctct cggcagtgag ctcgggcccg cggctccgcc 
101 tgctgctgct ggagagtgtt tctggtttgc tgcaacctcg aacggggtct 
151 gccgttgctc cggtgcatcc cccaaaccgc tcggccccac atttgcccgg 
201 gctcatgtgc ctattgcggc tgcatgggtc ggtgggcggg gcccagaacc 
251 tttcagctct tggggcattg gtgagtctca gtaatgcacg tctcagttcc
301 atcaaaactc ggtttgaggg cctgtgtctg ctgtccctgc tggtagggga 
351 gagccccaca gagctattcc agcagcactg tgtgtcttgg cttcggagca 
401 ttcagcaggt gttacagacc caggacccgc ctgccacaat ggagctggcc 
451 gtggctgtcc tgagggacct cctccgatat gcagcccagc tgcctgcact 
501 gttccgggac atctccatga accacctccc tggccttctc acctccctgc 
551 tgggcctcag gccagagtgt gagcagtcag cattggaagg aatgaaggct 
601 tgtatgacct atttccctcg ggcttgtggt tctctcaaag gcaagctggc 
651 ctcatttttt ctgtctaggg tggatgcctt gagccctcag ctccaacagt 
701 tggcctgtga gtgttattcc cggctgccct ctttaggggc tggcttttcc 
751 caaggcctga agcacaccga gagctgggag caggagctac acagtctgct 
801 ggcctcactg cacaccctgc tgggggccct gtacgaggga gcagagactg 
851 ctcctgtgca gaatgaaggc cctggggtgg agatgctgct gtcctcagaa
901 gatggtgatg cccatgtcct tctccagctt cggcagaggt tttcgggact 
951 ggcccgctgc ctagggctca tgctcagctc tgagtttgga gctcccgtgt 
1001 ccgtccctgt gcaggaaatc ctggatttca tctgccggac cctcagcgtc 
1051 agtagcaaga atattagctt gcatggagat ggtccctgcg gctgctgctg 
1101 ctgccctcta tccaccttga aggccttgga cctgctgtct gcactcatcc 
2351 cagccctgga agaggatttg acagttatta atatcaacag cagtgatgaa

2401 gaggaggagg aagaaggaga agaggaagaa gaagaagaag aagaagaaga

2451 ggaagaagaa gaagaggaag aagaggaaga ggaggaagac tttgaggaag

2501 aggaagagga tgaagaggaa tattttgaag aggaagaaga ggaggaagaa

2551 gagtttgagg aagaatttga ggaagaagaa ggtgagttag aggaagaaga

2601 agaagaggag gatgaggagg aggaagaaga actggaagag gtggaagacc

2651 tggagtttgg cacagcagga ggggaggtag aagaaggtgc accaccaccc

2701 ccaaccctgc ctccagctct gcctccccct gagtctcccc caaaggtgca

2751 gccagaaccc gaacccgaac ccgggctgct tttggaagtg gaggagccag

2801 ggacggagga ggagcgtggg gctgacacag ctcccaccct ggcccctgaa

2851 gcgctcccct cccagggaga ggtggagagg gaaggggaaa gccctgcggc

2901 agggccccct ccccaggagc ttgttgaaga agagccctct nctcccccaa

2951 ccctgttgga agaggagact gaggatggga gtgacaaggt gcagccccca

3001 ccagagacac ctgcagaaga agagatggag acagagacag aggccgaagc

3051 tctccaggaa aaggagcagg atgacacagc tgccatgctg gccgacttca

3101 tcgattgtcc ccctgatgat gagaagccac cacctcccac agagcctgac

3151 tcctagccat cttctgcacc ccacctcttt gtttccaata aagttatgtc

3201 cttaaaaaaa a
p160dna x p160dna-3

1 ggggcagccgttctgagtgggccctctgcgggctccgcggctggggttcc 50
1 ggggcagccgttctgagtgggccctctgcgggctccgcggctggggttcc 50

51 tggcgggaccgggggtctctcggcagtgagctcgggcccgcggctccgcc 100
51 tggcgggaccgggggtctctcggcagtgagctcgggcccgcggctccgcc 100

101 tgctgctgctggagagtgtttctggtttgctgcaacctcgaacggggtct 150
101 tgctgctgctggagagtgtttctggtttgctgcaacctcgaacggggtct 150

151 gccgttgctccggtgcatcccccaaaccgctcggccccacatttgcccgg 200
151 gccgttgctccggtgcatcccccaaaccgctcggccccacatttgcccgg 200

201 gctcatgtgcctattgcggctgcatgggtcggtgggcggggcccagaacc 250
201 gctcatgtgcctattgcggctgcatgggtcggtgggcggggcccagaacc 250

251 tttcagctcttggggcattggtgagtctcagtaatgcacgtctcagttcc 300
251 tttcagctcttggggcattggtgagtctcagtaatgcacgtctcagttcc 300

301 atcaaaactcggtttgagggcctgtgtctgctgtccctgctggtagggga 350
301 atcaaaactcggtttgagggcctgtgtctgctgtccctgctggtagggga 350

351 gagccccacagagctattccagcagcactgtgtgtcttggcttcggagca 400
351 gagccccacagagctattccagcagcactgtgtgtcttggcttcggagca 400

401 ttcagcaggtgttacagacccaggacccgcctgccacaatggagctggcc 450
401 ttcagcaggtgttacagacccaggacccgcctgccacaatggagctggcc 450

FIG. 18A
1066 agcttgcatggagatggtccctgcggctgctgctgctgccctctatccac 1115

1551 cttgaaggccttggacctgctgtctgcactcatcctcgcgtgtggaagcc 1600
1116 cttgaaggccttggacctgctgtctgcactcatcctcgcgtgtggaagcc 1165

1601 ggctcttgcgctttgggatcctgatcggccgcctgcttccccaggtcctc 1650
1166 ggctcttgcgctttgggatcctgatcggccgcctgcttccccaggtcctc 1215

1651 aattcctggagcatcggtagagattccctctctccaggccaggagaggcc 1700
1216 aattcctggagcatcggtagagattccctctctccaggccaggagaggcc 1265

1701 ttacagcacggttcggaccaaggtgtatgcgatattagagctgtgggtgc 1750
1266 ttacagcacggttcggaccaaggtgtatgcgatattagagctgtgggtgc 1315

1751 aggtttgtggggcctcggcgggaatgcttcagggaggagcctctggagag 1800
1316 aggtttgtggggcctcggcgggaatgcttcagggaggagcctctggagag 1365

1801 gccctgctcacccacctgctcagcgacatctccccgccagctgatgccct 1850
1366 gccctgctcacccacctgctcagcgacatctccccgccagctgatgccct 1415

1851 taagctgcgtagcccgcgggggagccctgatgggagtttgcagactggga 1900
1416 taagctgcgtagcccgcgggggagccctgatgggagtttgcagactggga 1465

1901 agcctagcgcccccaagaagctaaagctggatgtgggggaagctatggcc 1950
1466 agcctagcgcccccaagaagctaaagctggatgtgggggaagctatggcc 1515

1951 ccgccaagccaccggaaaggggatagcaatgccaacagcgacgtgtgtcc 2000
1516 ccgccaag 1523

2201 gtctcctcgctgcccacctcctcttgcctgtgccctgcaagccttctccc 2250
1524 ccacctcctcttgcctgtgccctgcaagccttctccc 1560

2251 tcggccagcgagaagatagccttgaggtctcctctttcttgctcagaagc 2300
1561 tcggccagcgagaagatagccttgaggtctcctctttcttgctcagaagc 1610

2301 actggtgacctgtgctgctctgacccacccccgggttcctcccctgcagc 2350
1611 actggtgacctgtgctgctctgacccacccccgggttcctcccctgcagc 1660

2351 ccatgggccccacctgccccacacctgctccagtccccctcctgaggccc 2400

FIG. 18C
1661 ccatgggccccacctgccccacacctgctccagtccccctcctgaggccc 1710

2401 catcgcccttcagggccccaccgttccatcctccgggccccatgccctca 2450
1711 catcgcccttcagggccccaccgttccatcctccgggccccatgccctca 1760

2451 gtgggctccatgccctcagcaggccccatgcccttcagcaggccccatgc 2500
1761 gtgggctccatgccctcagcaggccccatgcccttcagcaggccccatgc 1810

2501 cctcagcaggccctgtgccctcggagccctggacctccaccacagccaac 2550
1811 cctcagcaggccctgtgccctcggagccctggacctccaccacagccaac 1860

2551 ctcctaggccttctgtccaggcctagtgtctgtcctccccggcttcttcc 2600
1861 ctcctaggccttctgtccaggcctagtgtctgtcctccccggcttcttcc 1910

2601 tggccctgagaaccaccgggcaggctcaaatgaggaccccatccttgccc 2650
1911 tggccctgagaaccaccgggcaggctcaaatgaggaccccatccttgccc 1960

2651 ctagtgggactcccccacctactatacccccagatgaaacttttgggggg 2700
1961 ctagtgggactcccccacctactatacccccagatgaaacttttgggggg 2010

2701 agagtgcccagaccagcctttgtccactatgacaaggaggaggcatctga 2750
2011 agagtgcccagaccagcctttgtccactatgacaaggaggaggcatctga 2060

2751 tgtggagatctccttggaaagtgactctgatgacagcgtggtgatcgtgc 2800
2061 tgtggagatctccttggaaagtgactctgatgacagcgtggtgatcgtgc 2110

2801 ccgaggggcttccccccctgccacccccaccaccctcaggtgccacacca 2850
2111 ccgaggggcttccccccctgccacccccaccaccctcaggtgccacacca 2160

2851 ccccctatagcccccactgggccaccaacagcctcccctcctgtgccagc 2900
2161 ccccctatagcccccactgggccaccaacagcctcccctcctgtgccagc 2210

2901 gaaggaggagcctgaagaacttcctgcggccccagggcctctcccgccgc 2950
2211 gaaggaggagcctgaagaacttcctgcggccccagggcccctcccgccgc 2260

2951 ccccacctccgccgccgcctgttcctggtcctgtgaccctccctccaccc 3000
2261 ccccacctccgccgccgcctgttcctggtcctgtgacnctccctccaccc 2310

3001 cagttggtccctgaagggactcctggtgggggaggacccccagccctgga 3050
2311 cagttggtccctgaagggactcctggtgggggaggacccccagccctgga 2360

3051 agaggatttgacagttattaatatcaacagcagtgatgaagaggaggagg 3100
2361 agaggatttgacagttattaatatcaacagcagtgatgaagaggaggagg 2410

3101 aagaaggagaagaggaagaagaagaagaagaagaagaagaggaagaagaa 3150
2411 aagaaggagaagaggaagaagaagaagaagaagaagaagaggaagaagaa 2460

3151 gaagaggaagaagaggaagaggaggaagactttgaggaagaggaagagga 3200
2461 gaagaggaagaagaggaagaggaggaagactttgaggaagaggaagagga 2510

3201 tgaagaggaatattttgaagaggaagaagaggaggaagaagagtttgagg 3250
2511 tgaagaggaatattttgaagaggaagaagaggaggaagaagagtttgagg 2560

3251 aagaatttgaggaagaagaaggtgagttagaggaagaagaagaagaggag 3300
2561 aagaatttgaggaagaagaaggtgagttagaggaagaagaagaagaggag 2610

3301 gatgaggaggaggaagaagaactggaagaggtggaagacctggagtttgg 3350
2611 gatgaggaggaggaagaagaactggaagaggtggaagacctggagtttgg 2660

3351 cacagcaggaggggaggtagaagaaggtgcaccaccacccccaaccctgc 3400
2661 cacagcaggaggggaggtagaagaaggtgcaccaccacccccaaccctgc 2710

3401 ctccagctctgcctccccctgagtctcccccaaaggtgcagccagaaccc 3450
2711 ctccagctctgcctccccctgagtctcccccaaaggtgcagccagaaccc 2760

3451 gaacccgaacccgggctgcttttggaagtggaggagccagggacggagga 3500
2761 gaacccgaacccgggctgcttttggaagtggaggagccagggacggagga 2810

3501 ggagcgtggggctgacacagctcccaccctggcccctgaagcgctcccct 3550
2811 ggagcgtggggctgacacagctcccaccctggcccctgaagcgctcccct 2860

3551 cccagggagaggtggagagggaaggggaaagccctgcggcagggccccct 3600
2861 cccagggagaggtggagagggaaggggaaagccctgcggcagggccccct 2910

3601 ccccaggagcttgttgaagaagagccctctnctcccccaaccctgttgga 3650
2911 ccccaggagcttgttgaagaagagccctctnctcccccaaccctgttgga 2960

3651 agaggagactgaggatgggagtgacaaggtgcagcccccaccagagacac 3700
2961 agaggagactgaggatgggagtgacaaggtgcagcccccaccagagacac 3010

3701 ctgcagaagaagagatggagacagagacagaggccgaagctctccaggaa 3750
3011 ctgcagaagaagagatggagacagagacagaggccgaagctctccaggaa 3060

FIG. 18E
3061 aaggagcaggatgacacagctgccatgctggccgacttcatcgattgtcc 3110

3801 ccctgatgatgagaagccaccacctcccacagagcctgactcctagccat 3850
3111 ccctgatgatgagaagccaccacctcccacagagcctgactcctagccat 3160

3851 cttctgcaccccacctctttgtttccaataaagttatgtccttaaaaaaa 3900
3161 cttctgcaccccacctctttgtttccaataaagttatgtccttaaaaaaa 3210

3901 a 3901
3211 a 3211

FIG. 18F
P160.1 x p160.2

1 MELAVAVLRDLLRYAAQLPALFRDISMNHLPGLLTSLLGLRPECEQSALE 50
1 MELAVAVLRDLLRYAAQLPALFRDISMNHLPGLLTSLLGLRPECEQSALE 50

51 GMKACMTYFPRACGSLKGKLASFFLSRVDALSPQLQQLACECYSRLPSLG 100
51 GMKACMTYFPRACGSLKGKLASFFLSRVDALSPQLQQLACECYSRLPSLG 100

101 AGFSQGLKHTESWEQELHSLLASLHTLLGALYEGAETAPVQNEGPGVEML 150
101 AGFSQGLKHTESWEQELHSLLASLHTLLGALYEGAETAPVQNEGPGVEML 150

151 LSSEDGDAHVLLQLRQRFSGLARCLGLMLSSEFGAPVSVPVQEILDFICR 200
151 LSSEDGDAHVLLQLRQRFSGLARCLGLMLSSEFGAPVSVPVQEILDFICR 200

201 TLSVSSKNIVSGICHLFRALAQDTRQPGKYWGPESPQTVSSWSPSQRAST 250
201 TLSVSSKNI 209

351 FFLQSLHGDGPCGCCCCPLSTLKALDLLSALILACGSRLLRFGILIGRLL 400
210 SLHGDGPCGCCCCPLSTLKALDLLSALILACGSRLLRFGILIGRLL 255